CANADIAN JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. IX - NO. 1 
1957 


New characterizations of polyhedral cones H. Mirkil 
Some global theorems on hypersurfaces Chuan-Chih Hsiung 
A note on the Mathieu groups Lowell J. Paige 
Relative cohomology D. G. Higman 
Some remarks on Noetherian rings Michio Yoshida 
Spaces of dimension zero Bernhard Banaschewski 
Matrices with elements in a Boolean ring A. T. Butson 
Characteristic polynomials Hans Schneider 
Some theorems about $p_r(n)$ Morris Newman
Classes of positive definite unimodular circulants Morris Newman and Olga Taussky
Simultaneous pairs of linear and quadratic equations in a Galois field Eckford Cohen
The set of all generalized limits of bounded sequences Meyer Jerison
A note on asymptotic series H. F. Davis
On Ward's Perron-Stieltjes integral Ralph Henstock
A generalization of the Cauchy principal value Charles Fox
On the zeros of the Fresnel integrals Erwin Kreyszig
A minimum-maximum problem for differential expressions D. S. Carter
A mixed problem for normal hyperbolic linear partial differential equations of second order G. F. D. Duff


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 


University of Toronto Press 





EDITORIAL BOARD 
H. S. M. Coxeter, A. Gauthier, R. D. James, R. L. Jeffery,
G. de B. Robinson, H. Zassenhaus 
with the co-operation of 


H. Behnke, R. Brauer, D. B. DeLury, G. F. D. Duff, I. Halperin, 
W. K. Hayman, J. Leray, S. MacLane, P. Scherk, B. Segre, 
J. L. Synge, W. J. Webber 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, H. S. M. Coxeter, University of Toronto. Everything 
possible should be done to lighten the task of the reader; the notation 
and reference system should be carefully thought out. Every paper 
should contain an introduction summarizing the results as far as possible 
in such a way as to be understood by the non-expert. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $8.00. This is reduced to $4.00 for individual members of 
recognized Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of Alberta Assumption University 
University of British Columbia Carleton College 
Dalhousie University Ecole Polytechnique 
Université Laval Loyola College
University of Manitoba McGill University 
McMaster University Université de Montréal 
Queen’s University Royal Military College 
St. Mary’s University University of Toronto 


National Research Council of Canada 
and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 










NEW CHARACTERIZATIONS OF POLYHEDRAL CONES 
H. MIRKIL 


1. Introduction. A pyramid clearly has all its projections closed, even
when the line segments from vertex to base are extended to infinite half-lines.
Not so a circular cone. For if the cone is on its side and supported by the
$(x, y)$ plane in such a way that its infinite half-line of support coincides with
the positive $x$ axis, then its horizontal projection on the $(y, z)$ plane is the open
upper half-plane $z > 0$, together with the single point $(0, 0)$. It is our purpose
to show that the pyramid behaves better under projection precisely because
of its polyhedral nature. And this principle can be reinterpreted to give a
criterion for the positive extendibility of positive functionals defined on a
subspace of a partially ordered vector space.
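
The circular-cone example can be checked concretely. The sketch below is only an illustration under an assumed choice of cone, namely $2xz \ge y^2$ with $x, z \ge 0$, which is one circular cone resting on the $(x, y)$ plane along the positive $x$ axis. The names `in_cone` and `witness_x` are hypothetical helpers and do not come from the paper.

```python
def in_cone(x, y, z):
    """Membership in the assumed circular cone 2*x*z >= y**2, x >= 0, z >= 0."""
    return x >= 0 and z >= 0 and 2 * x * z >= y * y


def witness_x(y, z):
    """Smallest x >= 0 with (x, y, z) in the cone, or None if (y, z) is not
    in the projection of the cone on the (y, z) plane."""
    if z > 0:
        return y * y / (2 * z)
    if z == 0 and y == 0:
        return 0.0
    return None


# Points (1, z) with z > 0 are in the projection, but the witness x blows up
# as z -> 0, and the limit point (1, 0) is not in the projection.
for z in (1.0, 0.5, 2.0 ** -10):
    x = witness_x(1.0, z)
    print(z, x, in_cone(x, 1.0, z))
print(witness_x(1.0, 0.0))
```

Along $(y, z) = (1, z)$ with $z \to 0$ the witness $x = 1/(2z)$ grows without bound, so $(1, 0)$ is a limit of projected points that is not itself projected from any point of the cone.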

Throughout our discussion $E$ will be a real finite-dimensional vector space
and $E'$ its dual space. A subset $P$ of $E$ stable under vector addition and
multiplication by non-negative scalars is called a convex cone. In particular,
every linear subspace is a convex cone. The smallest subspace containing $P$
is $(-P) + P$, and the largest subspace contained in $P$ is $(-P) \cap P$. Omitting
parentheses, we shall write $-P + P$ and $-P \cap P$. It is customary to
call $\dim(-P + P)$ the dimension of $P$, and $\dim(-P \cap P)$ the lineality of $P$.
We define the polar $P^\circ$ of $P$ to be the set of all functionals $f \in E'$ such that
$f(x) \ge 0$ for all $x \in P$. $P^\circ$ is a closed convex cone, and in fact is the most
general such cone, since the double polar $P^{\circ\circ}$ coincides with the closure of $P$.
This fact authorizes us to use the notation $P^{\circ\circ}$ for the closure of $P$ (provided
that $P$ is a convex cone). The elementary duality theory of closed convex
cones can be summed up as follows:


(1) Galois connection: $(P + Q)^\circ = P^\circ \cap Q^\circ$ and $(P \cap Q)^\circ = (P^\circ + Q^\circ)^{\circ\circ}$.

(2) $\dim(-P + P) + \dim(-P^\circ \cap P^\circ) = \dim E$.

(3) If P® =. E’, then P = E. 

The above statements for closed convex cones $P$ and $Q$ are easily rephrased
for arbitrary convex cones. Proofs can be found in Fenchel (2).


2. Cones with all projections closed. We shall call a cone polyhedral
if it is the intersection of finitely many closed half-spaces. A theorem of Weyl
(3) states that such a cone is also the convex hull of finitely many half-lines,
and conversely. Thus $P$ and $P^\circ$ are polyhedral together.


THEOREM. If a closed convex cone $P$ has all its 2-dimensional projections
closed, then $P$ is polyhedral. Conversely, a closed convex polyhedral cone $P$ has all its
projections (of all dimensions) closed.


Received February 22, 1956 













Proof. We first dispose of the converse. Let $P$ be polyhedral and let $T$
be a projector. Then $P$ is precisely all non-negative combinations of finitely
many fixed vectors $x_1, \ldots, x_m$ and $TP$ is all non-negative combinations of
$Tx_1, \ldots, Tx_m$. Hence $TP$ is not only closed but even polyhedral.
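
The converse argument amounts to pushing generators through the projector. A minimal sketch, assuming a square-based pyramid in three dimensions and the projector onto the $(y, z)$ plane along the $x$ axis (both chosen only for illustration; `apply` and `cone_point` are hypothetical helper names):

```python
def apply(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]


def cone_point(generators, coeffs):
    """Non-negative combination sum_i coeffs[i] * generators[i]."""
    assert all(c >= 0 for c in coeffs)
    dim = len(generators[0])
    return [sum(c * g[i] for c, g in zip(coeffs, generators)) for i in range(dim)]


# Generators of an (assumed) pyramid with vertex at the origin.
gens = [[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]]
# Projector onto the (y, z) plane along the x axis.
T = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]
# TP is generated by the images of the generators of P.
proj_gens = [apply(T, g) for g in gens]

coeffs = [1, 2, 0, 3]
print(apply(T, cone_point(gens, coeffs)), cone_point(proj_gens, coeffs))
```

By linearity of $T$ the two printed vectors agree for every choice of non-negative coefficients, which is exactly why $TP$ is finitely generated.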

Now assume that $P$ and all its 2-dimensional projections are closed and make
an induction on the dimension $n$ of $-P + P$, which we can without loss of
generality take equal to the dimension of the whole vector space $E$. Since
there are only 6 linearly inequivalent closed convex cones of dimension
$\le 2$, and since all of these are polyhedral, we can begin with $n \ge 3$.

First suppose that $P$ contains some line $L$, or equivalently that $\dim(-P \cap P)
> 0$. If $T$ projects $P$ parallel to $L$ into some hyperplane ($L$ the null space
of the projector $T$), then $TP$ has lower dimension than $P$ and has all its
2-dimensional projections closed; hence by induction $TP$ is the smallest cone
containing the images $Tx_1, \ldots, Tx_r$ of some $x_1, \ldots, x_r \in P$. Let $x$ be any
non-zero vector in $L$. Because $P = P + L$, $P$ is the complete inverse
image of $TP$. Hence $P$ is the smallest cone containing $x, -x, x_1, \ldots, x_r$.


Suppose now $\dim(-P \cap P) = 0$. The convex cone $-P^\circ + P^\circ$ is dense in
$E'$, hence is all of $E'$, and $P^\circ$ contains $n$ linearly independent functionals
whose sum $f$ is strictly positive at all non-zero points of $P$. Let $H$ be the
affine hyperplane $\{x : f(x) = 1\}$. $P$ is the smallest cone containing $H \cap P$,
and this compact convex set is the convex hull of its extreme points. We shall
show that these are finite in number by showing that each point $x \in H \cap P$
has a neighbourhood containing no extreme point other than possibly $x$
itself.

Let $H_0$ be the linear hyperplane through the origin parallel to $H$ and let $L$
be the line through $x$. Let $T$ project the whole space $E$ onto $H_0$ along $L$. The
cone $TP$ is polyhedral by induction, hence consists of all non-negative
combinations of some $y_1, \ldots, y_r \in H_0$. We can without loss of generality take
these as images under $T$ of

$x_1 = y_1 + x, \ldots, x_r = y_r + x \in H \cap P$.

(For if they are originally the images of $y_1 + c_1 x, \ldots, y_r + c_r x \in P$, then
we can replace $y_1, \ldots, y_r$ by $c_1^{-1} y_1, \ldots, c_r^{-1} y_r$.)

Furthermore, near 0, $TP$ coincides with the convex hull of $0, y_1, \ldots, y_r$.
We assert that near $x$ the set $H \cap P$ coincides with the convex hull of $x,
x_1, \ldots, x_r$. For clearly any $x' \in H \cap P$ near $x$ has its image $Tx'$ near 0,
hence $Tx' = c_1 y_1 + \ldots + c_r y_r$ with all $c$'s non-negative and $0 \le c_1 + \ldots + c_r \le 1$.
And since the restriction of $T$ to $H$ is a one-one affine mapping of $H$ onto $H_0$,
it follows that $x'$ is the convex combination

$(1 - c_1 - \ldots - c_r)x + c_1 x_1 + \ldots + c_r x_r$.

Now $H \cap P$ is seen to be a convex polyhedron near $x$, and near $x$ the only
possible extreme point of $H \cap P$ is $x$ itself. Thus $H \cap P$ coincides near $x$
with the convex hull of $x, x_1, \ldots, x_r$.




An application of the Heine-Borel theorem completes the proof. We cover
the compact convex set $H \cap P$ with finitely many open sets, each containing
at most one extreme point of $H \cap P$. The whole set $H \cap P$ is then the convex
hull of finitely many points, $P$ is the convex hull of finitely many half-lines,
and the theorem is proved.


3. Positive extendibility of positive functionals. The main practical 
interest of our theorem lies in Corollary 1 below. De Leeuw (1) uses it, and a 
good deal more, in proving a convexity theorem for polynomials in several 
complex variables. And in fact it was a question of his about positive extend- 
ibility of functionals that led us to conjecture the theorem of the present 
paper. 

Given a convex cone $P$ that contains no line, we can make $E$ an ordered
vector space by defining $x \ge y$ to mean $x - y \in P$. Conversely the vectors
$\ge 0$ in an ordered vector space form a convex cone $P$ that contains no line.
When such a cone is closed it is called an order cone. And it is easy to prove that
each of the following properties characterizes order cones among all closed
convex cones $P$:

(1) P possesses extreme half-lines (though certain authors define extreme 
half-lines in such a way that this particular characterization has trivial 
exceptions). 

(2) $P$ is the convex hull of its extreme half-lines.

(3) P lies strictly to one side of some hyperplane. 

(4) P consists of all half-lines through some compact convex set that does 
not contain the origin. 

Our purpose in this section, however, is to characterize polyhedral order
cones among all order cones. The simplest kind of polyhedral ordering of a
vector space is the coordinatewise ordering relative to a basis, a vector being
non-negative if and only if all its coordinates are non-negative. A natural
example of non-polyhedral (and non-lattice) ordering is given by the cone of
positive-definite matrices in the real $n^2$-dimensional space of $n$-by-$n$ complex
hermitian matrices.


COROLLARY 1. Let E be a polyhedrally ordered vector space, and let F be a 
subspace with the induced ordering. Then every positive functional f on F can be 
extended to a positive functional on E. Conversely, let E be a vector space ordered 
by a closed cone in such a way that the above positive extension property holds for 
all subspaces F and positive functionals f on F. Then the ordering of E is poly- 
hedral. 


Proof. Let $T_F$ be the natural mapping of $E'$ onto $F'$, with null space $F^\circ$.
The condition for positive extendibility of functionals can be written
$T_F(P^\circ) = T_F((P \cap F)^\circ) = T_F((P^\circ + F^\circ)^{\circ\circ})$,
or equivalently

$P^\circ + F^\circ = (P^\circ + F^\circ)^{\circ\circ} + F^\circ = (P^\circ + F^\circ)^{\circ\circ}$.













Thus the condition asserts that all projections of $P^\circ$ are closed, and we know
by our theorem that this happens exactly when $P^\circ$ is polyhedral. But $P$ and
$P^\circ$ are polyhedral together.


COROLLARY 2. Let $F$ be a subspace of $E$, let $P$ be a polyhedral cone in $E$
containing no line, and let $K_F$ be a half-space of $F$ containing $F \cap P$. Then
$K_F = F \cap K_E$ for some half-space $K_E$ of $E$ containing $P$. Conversely, let $P$ be a
closed convex cone containing no line, and suppose that some half-space $K_E$
of $E$ like the above can be found for every choice of $F$ and $K_F$. Then $P$ is polyhedral.


Proof. We have simply restated Corollary 1 in a geometric language that
avoids explicit mention of functionals.


REFERENCES 


1. K. de Leeuw, A type of convexity in the space of n complex variables, Trans. Amer. Math. Soc.
(to appear).

2. W. Fenchel, Convex cones, sets, and functions, Princeton lecture notes (mimeographed),
1953.

3. H. Weyl, Elementare Theorie der konvexen Polyeder, Comm. Math. Helv. 7 (1935), 290


Dartmouth College 
Hanover, N.H. 




SOME GLOBAL THEOREMS ON HYPERSURFACES 


CHUAN-CHIH HSIUNG 


1. Introduction. The purpose of this paper is to establish the following 
theorems, which were obtained by Hopf and Voss in their joint paper (2) for 
the case where n = 2. 


THEOREM 1. Let $V^n$, $V^{*n}$ be two closed orientable hypersurfaces twice
differentiably imbedded in a Euclidean space $E^{n+1}$ of dimension $n + 1 \ge 3$. Suppose
that there is a differentiable homeomorphism between the two hypersurfaces $V^n$,
$V^{*n}$ such that the orientations of the two hypersurfaces $V^n$, $V^{*n}$ are preserved
and the line joining every pair of corresponding points $P$, $P^*$ of the two
hypersurfaces $V^n$, $V^{*n}$ is parallel to a fixed direction $R$, and such that the two
hypersurfaces $V^n$, $V^{*n}$ have equal first mean curvatures at every pair of the points $P$,
$P^*$ but no cylindrical elements whose generators are parallel to the fixed direction
$R$. Then the two hypersurfaces $V^n$, $V^{*n}$ can be transformed into each other by a
translation.


A closed hypersurface $V^n$ imbedded in a Euclidean space $E^{n+1}$ of dimension
$n + 1 \ge 2$ is said to be convex in a given direction, if no line in this direction
intersects the hypersurface $V^n$ at more than two points. It is obvious that a
closed hypersurface $V^n$ is convex in the usual sense if it is convex in every
direction in the space $E^{n+1}$.


THEOREM 2. Let a closed orientable hypersurface $V^n$ twice differentiably
imbedded in a Euclidean space $E^{n+1}$ of dimension $n + 1 \ge 3$ be convex in a
given direction $R$. If the two first mean curvatures of the hypersurface $V^n$ at every
pair of its points of intersection with the lines in the direction $R$ are equal, then the
hypersurface $V^n$ has a hyperplane of symmetry perpendicular to the direction $R$.


Theorem 2 can easily be deduced from Theorem 1. In fact, let $u$ be a mapping
of a hypersurface $V^n$ satisfying the conditions of Theorem 2 onto itself such
that the two points of intersection of the hypersurface $V^n$ with any line in the
direction $R$ are mapped into each other. In particular, if a line in the direction
$R$ is tangent to the hypersurface $V^n$ at a point $P$, then $uP = P$. Let $r$ be the
reflection with respect to an arbitrary hyperplane perpendicular to the
direction $R$, and $P$ any point of the hypersurface $V^n$. Then the mapping
$ruP = P^*$ maps the hypersurface $V^n$ onto the hypersurface $V^{*n} = r(V^n)$
generated by the point $P^*$, and the two hypersurfaces $V^n$, $V^{*n}$ satisfy the
conditions of Theorem 1 so that $ru = t$ is a translation. Therefore $u = rt$
is a reflection with respect to a hyperplane perpendicular to the direction $R$,
and hence Theorem 2 follows.
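
The last step uses the fact that a reflection in a hyperplane perpendicular to $R$, composed with a translation along $R$, is again such a reflection. A small sketch of that fact, assuming coordinates in which $R$ is the last axis (the function names `reflect` and `translate` are illustrative only):

```python
def reflect(y, c):
    """Reflection r in the hyperplane {last coordinate = c}."""
    return y[:-1] + [2 * c - y[-1]]


def translate(y, a):
    """Translation t by the amount a along the last axis."""
    return y[:-1] + [y[-1] + a]


def compose_rt(y, c, a):
    """u = r t: translate first, then reflect."""
    return reflect(translate(y, a), c)


point = [1, 2, 5]
# r t agrees with the reflection in the hyperplane {last coordinate = c - a/2}.
print(compose_rt(point, 3, 4), reflect(point, 3 - 4 / 2))
```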


Received April 7, 1956 













By noting that a closed hypersurface $V^n$ imbedded in a Euclidean space
$E^{n+1}$ of dimension $n + 1 \ge 2$ must be a hypersphere if it has a hyperplane of
symmetry perpendicular to every direction in the space $E^{n+1}$, we arrive
readily at the following known result from Theorem 2.


COROLLARY. A closed convex hypersurface $V^n$ of constant first mean curvature
twice differentiably imbedded in a Euclidean space $E^{n+1}$ of dimension $n + 1 \ge 3$
is a hypersphere.


THEOREM 3. Let $V^n$ ($V^{*n}$) be an orientable hypersurface with a closed boundary
$V^{n-1}$ ($V^{*n-1}$) of dimension $n - 1 \ge 1$ twice differentiably imbedded in a Euclidean
space $E^{n+1}$ of dimension $n + 1$. Suppose that there is a differentiable
homeomorphism between the two hypersurfaces $V^n$, $V^{*n}$ with the same properties as
those of the homeomorphism in Theorem 1.

(i) If the two boundaries $V^{n-1}$, $V^{*n-1}$ are coincident, then the two hypersurfaces
$V^n$, $V^{*n}$ are coincident.

(ii) If the two normals of the two hypersurfaces $V^n$, $V^{*n}$ at every pair of
corresponding points, under the given homeomorphism, of the two boundaries
$V^{n-1}$, $V^{*n-1}$ are parallel, then the two hypersurfaces $V^n$, $V^{*n}$ are transformed into
each other by a translation.


2. Preliminaries.¹ In a Euclidean space $E^{n+1}$ of dimension $n + 1 \ge 3$
let us consider a fixed orthogonal frame $O I_1 \ldots I_{n+1}$ with a point $O$ as the
origin. With respect to this orthogonal frame we define the vector product of
$n$ vectors $A_1, \ldots, A_n$ in the space $E^{n+1}$ to be the vector $A_{n+1}$, denoted by
$A_1 \times \ldots \times A_n$, satisfying the following conditions:

(a) the vector $A_{n+1}$ is normal to the $n$-dimensional subspace of $E^{n+1}$
determined by the vectors $A_1, \ldots, A_n$,

(b) the magnitude of the vector $A_{n+1}$ is equal to the volume of the
parallelepiped whose edges are the vectors $A_1, \ldots, A_n$,

(c) the two frames $O A_1 \ldots A_n A_{n+1}$ and $O I_1 \ldots I_{n+1}$ have the same
orientation.


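Let $\sigma$ be a permutation on the $n$ numbers $1, \ldots, n$; then

(2.1)    $A_{\sigma(1)} \times \ldots \times A_{\sigma(n)} = (\operatorname{sgn} \sigma)\, A_1 \times \ldots \times A_n$,

where $\operatorname{sgn} \sigma$ is $+1$ or $-1$ according as the permutation $\sigma$ is even or odd.
Let $i_1, \ldots, i_{n+1}$ be the unit vectors from the origin $O$ in the directions of the
vectors $I_1, \ldots, I_{n+1}$ and let $A_\alpha^j$ ($j = 1, \ldots, n + 1$) be the components² of
the vector $A_\alpha$ ($\alpha = 1, \ldots, n$) with respect to the frame $O I_1 \ldots I_{n+1}$; then
the scalar product of any two vectors $A_\alpha$ and $A_\beta$ and the vector product of $n$
vectors $A_1, \ldots, A_n$ are, respectively,

(2.2)    $A_\alpha \cdot A_\beta = \sum_{i=1}^{n+1} A_\alpha^i A_\beta^i$,

(2.3)    $A_1 \times \ldots \times A_n = (-1)^n \begin{vmatrix} i_1 & i_2 & \ldots & i_{n+1} \\ A_1^1 & A_1^2 & \ldots & A_1^{n+1} \\ \vdots & \vdots & & \vdots \\ A_n^1 & A_n^2 & \ldots & A_n^{n+1} \end{vmatrix}$.

If $A_\alpha^i$ are differentiable functions of the variables $x^1, \ldots, x^n$, then by equation
(2.3) and the differentiation of determinants

(2.4)    $\frac{\partial}{\partial x^\beta} (A_1 \times \ldots \times A_n) = \sum_{\alpha=1}^{n} \left( A_1 \times \ldots \times A_{\alpha-1} \times \frac{\partial A_\alpha}{\partial x^\beta} \times A_{\alpha+1} \times \ldots \times A_n \right)$.

¹ For this section see, for instance, (3, pp. 287-289).
² Throughout this paper all Latin indices take the values 1 to $n + 1$ and Greek indices the
values 1 to $n$ unless stated otherwise. We shall also follow the convention that repeated indices
imply summation.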
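
Conditions (a)-(c) and the determinant formula (2.3) can be tried out numerically. The following is a rough sketch, assuming the sign factor $(-1)^n$ in front of the formal determinant and expanding it along its first row; `det` and `vector_product` are hypothetical helper names, and the cofactor expansion is only sensible for small $n$.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def vector_product(vectors):
    """A_1 x ... x A_n for n vectors in R^(n+1): (-1)^n times the formal
    determinant whose first row holds the unit vectors i_1, ..., i_(n+1)."""
    n = len(vectors)
    assert all(len(v) == n + 1 for v in vectors)
    result = []
    for j in range(n + 1):
        minor = [v[:j] + v[j + 1:] for v in vectors]
        result.append((-1) ** n * (-1) ** j * det(minor))
    return result


a1, a2 = [1, 2, 3], [4, 5, 6]
c = vector_product([a1, a2])
print(c)                                    # the ordinary cross product when n = 2
print(sum(x * y for x, y in zip(c, a1)))    # condition (a): orthogonality
print(det([a1, a2, c]))                     # condition (c): positive orientation
```

For $n = 2$ this reduces to the ordinary cross product, and swapping two arguments changes the sign, in line with (2.1).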
Now we consider a hypersurface $V^n$ twice differentiably imbedded in the
space $E^{n+1}$ with a closed boundary $V^{n-1}$ of dimension $n - 1$. Let $(y^1, \ldots, y^{n+1})$
be the coordinates of a point $P$ in the space $E^{n+1}$ with respect to the orthogonal
frame $O I_1 \ldots I_{n+1}$. Then the hypersurface $V^n$ can be given by the parametric
equations

(2.5)    $y^i = f^i(x^1, \ldots, x^n)$  ($i = 1, \ldots, n + 1$),

or the vector equation

(2.6)    $Y = F(x^1, \ldots, x^n)$,

where $y^i$ and $f^i$ are respectively the components of the two vectors $Y$ and $F$,
the parameters $x^1, \ldots, x^n$ take values in a simply connected domain $D$
of the $n$-dimensional real number space, $f^i(x^1, \ldots, x^n)$ are twice differentiable
and the Jacobian matrix $\|\partial y^i/\partial x^\alpha\|$ is of rank $n$ at all points of the domain $D$.
If we denote the vector $\partial Y/\partial x^\alpha$ by $Y_\alpha$ ($\alpha = 1, \ldots, n$), then the first
fundamental form of the hypersurface $V^n$ at the point $P$ is


(2.7)    $ds^2 = g_{\alpha\beta}\, dx^\alpha\, dx^\beta$,

where

(2.8)    $g_{\alpha\beta} = Y_\alpha \cdot Y_\beta$,

and the matrix $\|g_{\alpha\beta}\|$ is positive definite so that the determinant $g = |g_{\alpha\beta}| > 0$.


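Let $N$ be the unit normal vector of the hypersurface $V^n$ at the point $P$,
and $N_\alpha$ the vector $\partial N/\partial x^\alpha$; then

(2.9)    $N_\alpha = -b_{\alpha\beta}\, g^{\beta\gamma}\, Y_\gamma$,

where

(2.10)    $b_{\alpha\beta} = b_{\beta\alpha} = -Y_\alpha \cdot N_\beta$

are the coefficients of the second fundamental form of the hypersurface
$V^n$ at the point $P$, and $g^{\alpha\beta}$ denotes the cofactor of $g_{\alpha\beta}$ in $g$ divided by $g$ so that

(2.11)    $g^{\alpha\beta} g_{\beta\gamma} = \delta^\alpha_\gamma$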

(the Kronecker deltas). The principal curvatures $\kappa_1, \ldots, \kappa_n$ of the
hypersurface $V^n$ at the point $P$ are the roots of the determinant equation

(2.12)    $|b_{\alpha\beta} - \kappa g_{\alpha\beta}| = 0$,

from which follows immediately the first mean curvature of the hypersurface
$V^n$ at the point $P$:

(2.13)    $M_1 = \frac{1}{n} \sum_{\alpha=1}^{n} \kappa_\alpha = \frac{1}{n}\, b_{\alpha\beta}\, g^{\alpha\beta}$.

The area element of the hypersurface $V^n$ at the point $P$ is

(2.14)    $dA = g^{1/2}\, dx^1 \wedge \ldots \wedge dx^n$,


where the operator $d$ is the exterior differentiation, and the wedge denotes
the exterior multiplication. Now we choose the direction of the unit normal
vector $N$ in such a way that the two frames $P Y_1 \ldots Y_n N$ and $O I_1 \ldots I_{n+1}$
have the same orientation. Then from equations (2.3) and (2.14) it follows that

(2.15)    $g^{1/2} N = Y_1 \times \ldots \times Y_n$,

(2.16)    $|Y_1, \ldots, Y_n, N| = g^{1/2}$,

where the left side of equation (2.16) is a determinant indicated by writing
only a typical row.


3. An integral formula. Let $V^n$ be an orientable hypersurface with a
closed boundary $V^{n-1}$ of dimension $n - 1 \ge 1$ twice differentiably imbedded
in a Euclidean space $E^{n+1}$ of dimension $n + 1$, and suppose that the
hypersurface $V^n$ is given by the vector equation (2.6). Let $I$ be the unit vector in a
fixed direction $R$ in the space $E^{n+1}$, and $w$ a twice differentiable function over
the hypersurface $V^n$. Then §2 can be applied to the hypersurface $V^n$, and we
shall use the same symbols with a star for the corresponding quantities for the
hypersurface $V^{*n}$ defined by the vector equation

(3.1)    $Y^* = Y + W$,

where

(3.2)    $W = wI$.

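Let $\Omega^\alpha$ ($\alpha = 1, \ldots, n$) be $n$ vectors in the space $E^{n+1}$, and suppose that the
components of each vector $\Omega^\alpha$ with respect to the orthogonal frame $O I_1 \ldots I_{n+1}$
are differentiable functions of the variables $x^1, \ldots, x^n$. In order to derive
an integral formula for the two hypersurfaces $V^n$, $V^{*n}$ we use the vector
product of vectors and the exterior multiplication of differentials to define the
vector

(3.3)    $\Omega^1 \otimes \ldots \otimes \Omega^{\alpha-1} \otimes d\Omega^\alpha \otimes \ldots \otimes d\Omega^n = (\Omega^1 \times \ldots \times \Omega^{\alpha-1} \times \Omega^\alpha_{\beta_\alpha} \times \ldots \times \Omega^n_{\beta_n})\, dx^{\beta_\alpha} \wedge \ldots \wedge dx^{\beta_n}$

for $\alpha = 1, \ldots, n$, where

$\Omega^\alpha_\beta = \partial \Omega^\alpha / \partial x^\beta$.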
a 


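It is obvious that the vector (3.3) is independent of the order of the vectors
$d\Omega^\alpha, \ldots, d\Omega^n$. Thus from equations (2.9), (2.13), (2.14), (2.15) we obtain

(3.4)    $dY \otimes \ldots \otimes dY = n!\, (Y_1 \times \ldots \times Y_n)\, dx^1 \wedge \ldots \wedge dx^n = n!\, N\, dA$,

(3.5)    $dY \otimes \ldots \otimes dY \otimes dN = (n - 1)! \sum_{\alpha=1}^{n} (Y_1 \times \ldots \times Y_{\alpha-1} \times N_\alpha \times Y_{\alpha+1} \times \ldots \times Y_n)\, dx^1 \wedge \ldots \wedge dx^n = -n!\, M_1 N\, dA$.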


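Making use of equations (3.1), (3.2), (3.4) and its analogue for the hypersurface
$V^{*n}$, and noting that

$\underbrace{dW \otimes \ldots \otimes dW}_{\alpha\ \text{factors}} \otimes \underbrace{dY \otimes \ldots \otimes dY}_{n - \alpha\ \text{factors}} = 0$,

$\underbrace{dW \otimes \ldots \otimes dW}_{\alpha\ \text{factors}} \otimes \underbrace{dY^* \otimes \ldots \otimes dY^*}_{n - \alpha\ \text{factors}} = 0$

for $\alpha \ge 2$ and

$|W, Y^*_1, \ldots, Y^*_n| = |W, Y_1, \ldots, Y_n|$,

we are easily led to

(3.6)    $(n - 1)!\, (N^* dA^* - N\, dA) = dW \otimes dY \otimes \ldots \otimes dY = dW \otimes dY^* \otimes \ldots \otimes dY^*$,

(3.7)    $W \cdot N\, dA = W \cdot N^*\, dA^*$,

(3.8)    $|W, N^*, Y^*_1, \ldots, Y^*_{\alpha-1}, Y^*_{\alpha+1}, \ldots, Y^*_n| = |W, N^*, Y_1, \ldots, Y_{\alpha-1}, Y_{\alpha+1}, \ldots, Y_n|$  ($\alpha = 1, \ldots, n$).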
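From equations (2.3), (3.3), (3.5), (3.6) it follows immediately that

(3.9)    $W \cdot (N \otimes dY \otimes \ldots \otimes dY) = (-1)^n (n - 1)! \sum_{\alpha=1}^{n} |W, N, Y_1, \ldots, Y_{\alpha-1}, Y_{\alpha+1}, \ldots, Y_n|\, dx^1 \wedge \ldots \wedge dx^{\alpha-1} \wedge dx^{\alpha+1} \wedge \ldots \wedge dx^n$,

(3.10)    $d[W \cdot (N \otimes dY \otimes \ldots \otimes dY)] = -N \cdot (dW \otimes dY \otimes \ldots \otimes dY) + W \cdot (dN \otimes dY \otimes \ldots \otimes dY) = -n!\, M_1 W \cdot N\, dA - (n - 1)!\, (N \cdot N^* dA^* - dA)$.

Similarly, in consequence of equations (3.6), (3.7), (3.8) and those analogous
to equations (3.5), (3.9) by changing the vectors $Y$, $N$ to the vectors $Y^*$, $N^*$
respectively, we obtain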











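(3.11)    $d[W \cdot (N^* \otimes dY \otimes \ldots \otimes dY)] = d[W \cdot (N^* \otimes dY^* \otimes \ldots \otimes dY^*)] = -N^* \cdot (dW \otimes dY^* \otimes \ldots \otimes dY^*) + W \cdot (dN^* \otimes dY^* \otimes \ldots \otimes dY^*) = -n!\, M_1^* W \cdot N\, dA - (n - 1)!\, (dA^* - N^* \cdot N\, dA)$.

Thus, from equations (3.9), (3.10), (3.11),

(3.12)    $d \sum_{\alpha=1}^{n} |W, N - N^*, Y_1, \ldots, Y_{\alpha-1}, Y_{\alpha+1}, \ldots, Y_n|\, dx^1 \wedge \ldots \wedge dx^{\alpha-1} \wedge dx^{\alpha+1} \wedge \ldots \wedge dx^n = \frac{(-1)^n}{(n - 1)!}\, d[W \cdot (N \otimes dY \otimes \ldots \otimes dY) - W \cdot (N^* \otimes dY \otimes \ldots \otimes dY)] = (-1)^n [n (M_1^* - M_1) W \cdot N\, dA + (1 - N \cdot N^*)(dA + dA^*)]$.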


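Integrating equation (3.12) over the hypersurface $V^n$ and applying
Stokes' theorem to the left side of the equation, we then arrive at the integral
formula

(3.13)    $\int_{V^{n-1}} \sum_{\alpha=1}^{n} |W, N - N^*, Y_1, \ldots, Y_{\alpha-1}, Y_{\alpha+1}, \ldots, Y_n|\, dx^1 \wedge \ldots \wedge dx^{\alpha-1} \wedge dx^{\alpha+1} \wedge \ldots \wedge dx^n = (-1)^n \int_{V^n} [n (M_1^* - M_1) W \cdot N\, dA + (1 - N \cdot N^*)(dA + dA^*)]$.

In particular, when the hypersurface $V^n$ is closed and orientable, the integral
on the left side of equation (3.13) vanishes and hence

(3.14)    $n \int_{V^n} (M_1^* - M_1) W \cdot N\, dA + \int_{V^n} (1 - N \cdot N^*)(dA + dA^*) = 0$.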


4. Proof of Theorems 1 and 3. It is easily seen that we can apply the
results in §3 to two hypersurfaces $V^n$, $V^{*n}$ satisfying the assumptions of
Theorem 1. Since $M_1^* = M_1$ at every pair of corresponding points of the two
hypersurfaces $V^n$, $V^{*n}$, the formula (3.14) becomes

(4.1)    $\int_{V^n} (1 - N \cdot N^*)(dA + dA^*) = 0$.

But $dA > 0$, $dA^* > 0$ and $1 - N \cdot N^* \ge 0$ due to the fact that $N$ and $N^*$
are unit vectors. Thus the integrand of equation (4.1) is non-negative, and
therefore equation (4.1) holds when and only when $1 - N \cdot N^* = 0$, which
implies that

(4.2)    $N^* = N$.
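
The step from $1 - N \cdot N^* = 0$ to (4.2) rests on a one-line identity for unit vectors, spelled out here for convenience:

```latex
\[
  |N - N^*|^2 = N \cdot N - 2\, N \cdot N^* + N^* \cdot N^* = 2\,(1 - N \cdot N^*) \ge 0,
\]
```

so that $1 - N \cdot N^*$ vanishes exactly when $N^* = N$.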

Now in the space $E^{n+1}$ we choose the orthogonal frame $O I_1 \ldots I_{n+1}$, with
respect to which a point in the space $E^{n+1}$ has coordinates $y^1, \ldots, y^{n+1}$, in
such a way that the unit vector $i_{n+1}$ is the fixed unit vector $I$. Since the
hypersurface $V^n$ has no cylindrical elements whose generators are parallel to the
fixed vector $I$, the closed set $M$ of all points of the hypersurface $V^n$, at each
of which the $y^{n+1}$-component of the unit normal vector $N$ of the hypersurface
$V^n$ is zero, has no inner points and therefore the open set $V^n - M$ is everywhere
dense over $V^n$. Thus, in neighborhoods of any point of the set $V^n - M$ and
its corresponding point on the hypersurface $V^{*n}$, $y^1, \ldots, y^n$ are regular
parameters of the two hypersurfaces $V^n$, $V^{*n}$ so that the hypersurfaces $V^n$, $V^{*n}$
can be represented respectively by the equations


(4.3)    $y^{n+1} = y^{n+1}(y^1, \ldots, y^n)$,    $y^{*n+1} = y^{n+1}(y^1, \ldots, y^n) + w(y^1, \ldots, y^n)$.

By means of equations (2.15), (4.3) we obtain the unit normal vectors $N$, $N^*$
of the hypersurfaces $V^n$, $V^{*n}$:

(4.4)    $N = g^{-1/2} \left( -\frac{\partial y^{n+1}}{\partial y^1}, \ldots, -\frac{\partial y^{n+1}}{\partial y^n}, 1 \right)$,    $N^* = g^{*-1/2} \left( -\frac{\partial y^{*n+1}}{\partial y^1}, \ldots, -\frac{\partial y^{*n+1}}{\partial y^n}, 1 \right)$,

from which and equations (4.2), (4.3) it follows immediately that in a
neighborhood of any point of the set $V^n - M$,

$\partial y^{*n+1}/\partial y^\alpha = \partial y^{n+1}/\partial y^\alpha$  ($\alpha = 1, \ldots, n$),

and the function $w$ is constant. Thus $\partial w/\partial y^\alpha$ ($\alpha = 1, \ldots, n$) are zero in the
everywhere dense set $V^n - M$ and therefore on the whole hypersurface $V^n$
by continuity. Hence the function $w$ is constant on the whole hypersurface
$V^n$, and the proof of Theorem 1 is complete.

In both parts of Theorem 3 the integral over the boundary $V^{n-1}$ on the
left side of the formula (3.13) also vanishes, since over the boundary $V^{n-1}$
$W = 0$ and $N^* = N$ in the two parts respectively. By the same argument
as that in the above proof of Theorem 1, we therefore obtain between the
two hypersurfaces $V^n$, $V^{*n}$ a translation, which in part (i) reduces to an identity.
Hence Theorem 3 is proved.

Now suppose that in Theorem 3 the fixed direction $R$ is along the vector
$I_{n+1}$ and the hypersurfaces $V^n$, $V^{*n}$ can be represented by equations of the
form $y^{n+1} = y^{n+1}(y^1, \ldots, y^n)$. Then part (i) of Theorem 3 can be stated as
follows: The problem of finding a function $y^{n+1}(y^1, \ldots, y^n)$ over a bounded
region in the space $(y^1, \ldots, y^n)$ with given boundary values such that the
first mean curvature $M_1$ of the hypersurface $V^n$ defined by the equation
$y^{n+1} = y^{n+1}(y^1, \ldots, y^n)$ is a given function $M_1(y^1, \ldots, y^n)$ admits at most one
solution. Making use of equations (2.10), (2.13), (4.4) and


we can easily obtain the first mean curvature of the hypersurface $V^n$, namely,

(4.5)    $M_1 = n^{-1} g^{-1/2} g^{\alpha\beta}\, \frac{\partial^2 y^{n+1}}{\partial y^\alpha\, \partial y^\beta}$.
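
Formula (4.5) lends itself to a quick numerical check. The sketch below is only illustrative: it assumes the graph metric $g_{\alpha\beta} = \delta_{\alpha\beta} + p_\alpha p_\beta$ with $p_\alpha = \partial y^{n+1}/\partial y^\alpha$ (so $g = 1 + \sum p_\alpha^2$ and $g^{\alpha\beta} = \delta_{\alpha\beta} - p_\alpha p_\beta / g$), approximates derivatives by central differences, and uses the hypothetical names `mean_curvature_graph` and `upper_hemisphere`.

```python
import math


def mean_curvature_graph(f, point, h=1e-4):
    """First mean curvature M_1 of the graph y^{n+1} = f(y^1, ..., y^n) at
    `point`, via M_1 = (1/n) g^(-1/2) g^{ab} d^2 f / dy^a dy^b, with the unit
    normal chosen to have positive last component."""
    n = len(point)

    def shifted(deltas):
        q = list(point)
        for idx, d in deltas:
            q[idx] += d
        return f(q)

    # First derivatives p_a by central differences.
    p = [(shifted([(a, h)]) - shifted([(a, -h)])) / (2 * h) for a in range(n)]
    g = 1.0 + sum(pa * pa for pa in p)
    total = 0.0
    for a in range(n):
        for b in range(n):
            if a == b:
                hess = (shifted([(a, h)]) - 2 * f(list(point))
                        + shifted([(a, -h)])) / h ** 2
            else:
                hess = (shifted([(a, h), (b, h)]) - shifted([(a, h), (b, -h)])
                        - shifted([(a, -h), (b, h)])
                        + shifted([(a, -h), (b, -h)])) / (4 * h ** 2)
            g_inv = (1.0 if a == b else 0.0) - p[a] * p[b] / g
            total += g_inv * hess
    return total / (n * math.sqrt(g))


def upper_hemisphere(q):
    """Upper half of the unit hypersphere as a graph over the unit ball."""
    return math.sqrt(1.0 - sum(c * c for c in q))


print(mean_curvature_graph(upper_hemisphere, [0.3, 0.2]))
```

With the normal pointing in the positive $y^{n+1}$ direction, the unit hypersphere has all principal curvatures equal to $-1$, so the printed value should be close to $-1$; a hyperplane gives a value close to $0$.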













Thus the above special case of part (i) of Theorem 3 is a consequence of the 


well-known uniqueness theorem for the solutions of elliptic differential equa- 


tions of the second order, since the determinant |g**| = 1/g > 0. 

5. Connection with symmetrizations. Let $y^1, \ldots, y^{n+1}$ be the
coordinates of a point with respect to a fixed orthogonal frame $O I_1 \ldots I_{n+1}$
in a Euclidean space $E^{n+1}$ of dimension $n + 1 \ge 3$, and let a closed orientable
hypersurface $V^n$ twice differentiably imbedded in the space $E^{n+1}$ be convex
in the direction of the vector $I_{n+1}$. Let $P$ be any point of the hypersurface
$V^n$, and $P^*$ the other point of intersection of the hypersurface $V^n$ by the line
$l$ through the point $P$ and in the direction of the vector $I_{n+1}$. If the line $l$ is
tangent to the hypersurface $V^n$, then the point $P^*$ coincides with the point $P$.
Let $y^{n+1}$, $y^{*n+1}$ be respectively the $(n + 1)$th coordinates of the points $P$, $P^*$
with respect to the frame $O I_1 \ldots I_{n+1}$, and $M_1^*$, $N^*$ the first mean curvature
and the unit normal vector of the hypersurface $V^n$ at the point $P^*$.

Steiner's symmetrization of the hypersurface $V^n$ with respect to the
hyperplane $y^{n+1} = 0$ is a geometric operation by which any point $P$ of the
hypersurface $V^n$ goes into a point $P'$ on the line $l$ with

$y'^{\,n+1} = \frac{1}{2} (y^{n+1} - y^{*n+1})$

as its $(n + 1)$th coordinate with respect to the frame $O I_1 \ldots I_{n+1}$. In the time
interval $0 \le t \le 1$, we shift the segment $PP^*$ along its line $l$ into the position
$P'P^{*\prime}$ such that the $(n + 1)$th coordinates of the points $P'$, $P^{*\prime}$ with respect
to the frame $O I_1 \ldots I_{n+1}$ are respectively given by

(5.1)    $y'^{\,n+1} = y^{n+1} - \frac{t}{2} (y^{n+1} + y^{*n+1})$,    $y^{*\prime\,n+1} = y^{*n+1} - \frac{t}{2} (y^{n+1} + y^{*n+1})$.

That is, the segment $PP^*$ is shifted with uniform velocity into the position
where it is bisected by the hyperplane $y^{n+1} = 0$. This transformation $T_t$ is
called the continuous symmetrization of Steiner.³ $T_0$ is the identity and $T_1$
results in a complete symmetrization. It is obvious that the transformation
$T_t$ leaves the volume of the hypersurface $V^n$ unchanged.
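
The shift described by (5.1) can be sketched for a single chord. The function below is an illustration only, assuming the form $y \mapsto y - \frac{t}{2}(y + y^*)$ for both endpoints; the name `continuous_symmetrization` is hypothetical.

```python
def continuous_symmetrization(y, y_star, t):
    """Shift the chord with end coordinates (y, y_star) along its line,
    following y -> y - (t/2)(y + y_star) for both end points, 0 <= t <= 1."""
    shift = 0.5 * t * (y + y_star)
    return y - shift, y_star - shift


for t in (0.0, 0.5, 1.0):
    print(t, continuous_symmetrization(3.0, -1.0, t))
```

At $t = 0$ nothing moves, at $t = 1$ the chord is bisected by the hyperplane $y^{n+1} = 0$, and the chord length is the same for every $t$, which is why the enclosed volume is unchanged.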

³ For the continuous symmetrization of Steiner in a Euclidean space $E^n$ of dimension $n = 2$,
see (1, pp. 249-251; 4, pp. 200-202).

Now let us consider a neighboring hypersurface $V^n_{(\epsilon)}$ of the hypersurface
$V^n$ defined by the vector equation

(5.2)    $Y^{(\epsilon)} = Y + \epsilon (W \cdot N) N$,

where $\epsilon$ is an infinitesimal, $Y$ is the position vector of the point $P$ of the
hypersurface $V^n$ with respect to the frame $O I_1 \ldots I_{n+1}$, and

(5.3)    $W = w I_{n+1}$,    $w = -y^{n+1} - y^{*n+1}$.

An elementary calculation and the use of equations (5.2), (2.8), (2.9) yield
the coefficients of the first fundamental form of the hypersurface $V^n_{(\epsilon)}$:

(5.4)    $g^{(\epsilon)}_{\alpha\beta} = g_{\alpha\beta} - 2\epsilon (W \cdot N)\, b_{\alpha\beta} + \ldots$,

and therefore

(5.5)    $g^{(\epsilon)} = |g^{(\epsilon)}_{\alpha\beta}| = g - 2n\epsilon (W \cdot N) M_1 g + \ldots$,

where the omitted terms are of degrees $\ge 2$ in $\epsilon$. From equations (5.5), (2.14)
follows immediately the area of the hypersurface $V^n_{(\epsilon)}$,

(5.6)    $A^{(\epsilon)} = \int_{V^n} \sqrt{g^{(\epsilon)}}\, dx^1 \wedge \ldots \wedge dx^n = A - n\epsilon \int_{V^n} M_1 (W \cdot N)\, dA + \ldots$.


Thus we obtain

(5.7)    $\left( \frac{\partial A^{(\epsilon)}}{\partial \epsilon} \right)_{\epsilon=0} = -n \int_{V^n} M_1 (W \cdot N)\, dA$.
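
The passage from (5.4) to (5.7) uses only the first-order expansion of a determinant together with (2.13). Written out, as a sketch of the omitted intermediate steps:

```latex
\begin{align*}
g^{(\epsilon)}
  &= \det\bigl(g_{\alpha\beta} - 2\epsilon\,(W \cdot N)\, b_{\alpha\beta}\bigr) + O(\epsilon^2)
   = g\,\bigl(1 - 2\epsilon\,(W \cdot N)\, g^{\alpha\beta} b_{\alpha\beta}\bigr) + O(\epsilon^2) \\
  &= g - 2n\epsilon\,(W \cdot N)\, M_1\, g + O(\epsilon^2), \\
\sqrt{g^{(\epsilon)}}
  &= \sqrt{g}\,\bigl(1 - n\epsilon\,(W \cdot N)\, M_1\bigr) + O(\epsilon^2).
\end{align*}
```

Integrating the last line against $dx^1 \wedge \ldots \wedge dx^n$ and differentiating at $\epsilon = 0$ gives the first variation of area stated in (5.7).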


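Similarly, replacing equation (5.2) by $Y^{*(\epsilon)} = Y^* + \epsilon (W^* \cdot N^*) N^*$ gives

(5.8)    $\left( \frac{\partial A^{(\epsilon)}}{\partial \epsilon} \right)_{\epsilon=0} = -n \int_{V^n} M_1^* (W^* \cdot N^*)\, dA^*$.

Noting that $y^{*n+1} = -y^{n+1} - w$, $W^* = W$ and making use of equation (3.7)
we obtain immediately

(5.9)    $W^* \cdot N^*\, dA^* = -W \cdot N\, dA$,

and therefore equation (5.8) becomes

(5.10)    $\left( \frac{\partial A^{(\epsilon)}}{\partial \epsilon} \right)_{\epsilon=0} = n \int_{V^n} M_1^* (W \cdot N)\, dA$.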
de e=0 J 


Thus the addition of equations (5.7), (5.10) gives

(5.11)    $\left( \frac{\partial A^{(\epsilon)}}{\partial \epsilon} \right)_{\epsilon=0} = \frac{n}{2} \int_{V^n} (M_1^* - M_1)\, W \cdot N\, dA$.


As in the proof of Theorem 2 in §1, we consider the reflection $r$ with respect
to the hyperplane $y^{n+1} = 0$. By this reflection $r$ the point $P^*$ of the
hypersurface $V^n$ goes into the point $\bar{P}^*$ defined by

(5.12)    $\bar{Y}^* = Y + W$,

which generates a hypersurface $\bar{V}^{*n}$. If equation (5.12) is used instead of
equation (3.1), then the formula (3.14) becomes

(5.13)    $n \int_{V^n} (M_1^* - M_1)\, W \cdot N\, dA + \int_{V^n} (1 - N \cdot \bar{N}^*)(dA + d\bar{A}^*) = 0$,

where $\bar{N}^*$ and $d\bar{A}^*$ are respectively the unit normal vector and the area element
of the hypersurface $\bar{V}^{*n}$ at the point $\bar{P}^*$. By interchanging the corresponding
quantities of the two hypersurfaces $V^n$, $\bar{V}^{*n}$ at the two points $P^*$, $\bar{P}^*$
respectively it is easily seen that

(5.14)    $(1 - N \cdot \bar{N}^*)\, d\bar{A}^* = (1 - \bar{N}^* \cdot N)\, dA$.











By means of equation (5.14), equation (5.13) reduces to

(5.15)    $\int_{V^n} (M_1^* - M_1)\, W \cdot N\, dA = -\frac{2}{n} \int_{V^n} (1 - N \cdot \bar{N}^*)\, dA$,

from which and equation (5.11) we therefore obtain

(5.16)    $\left( \frac{\partial A^{(\epsilon)}}{\partial \epsilon} \right)_{\epsilon=0} = -\int_{V^n} (1 - N \cdot \bar{N}^*)\, dA$.

Making use of equations (5.11), (5.15), (5.16) we can easily reach the following
conclusion:

If $M_1^* = M_1$ at every point $P$ of the hypersurface $V^n$, then $(\partial A^{(\epsilon)}/\partial \epsilon)_{\epsilon=0} = 0$
and the hypersurface $V^n$ is symmetric with respect to a hyperplane. If the
hypersurface $V^n$ is not symmetric with respect to a hyperplane and $\bar{N}^* \ne N$ at every
point $P$ of the hypersurface $V^n$, then $(\partial A^{(\epsilon)}/\partial \epsilon)_{\epsilon=0} < 0$.


REFERENCES 


1. W. Blaschke, Vorlesungen über Differentialgeometrie (Vol. 1, 3rd ed., Berlin, 1930).

2. H. Hopf and K. Voss, Ein Satz aus der Flächentheorie im Grossen, Archiv der Mathematik
3 (1952), 187-192.

3. C. C. Hsiung, Some integral formulas for closed hypersurfaces, Math. Scandinavica, 2 (1954),
286-294.

4. G. Pólya and G. Szegö, Isoperimetric inequalities in mathematical physics (Ann. Math.
Studies, No. 42, Princeton, 1951).


Lehigh University 




A NOTE ON THE MATHIEU GROUPS 
LOWELL J. PAIGE


1. Introduction. The principal result of this paper is the representation
of the Mathieu group $M_{23}$ as a group of 11 × 11 matrices over the Galois
Field GF(2). This is a new representation of $M_{23}$ and in §5 an indication of
how the techniques of this result might be extended to the Mathieu group
$M_{11}$ is given.

The results of §3 were essentially obtained by Professor E. Spanier while 
investigating another problem and it was during conversations with him 
that the present result was observed. 


2. Steiner systems and the Mathieu groups. A Steiner system
$S(p, q, r)$, with $p < q < r$, is defined on the r integers 1, 2, ..., r and consists
of $\binom{r}{p} / \binom{q}{p}$ subsets $H_i$ of q integers each with the property that any
arbitrary set of p integers is contained in one and only one of the subsets $H_i$.
For example, the Steiner system $S(2, q + 1, q^2 + q + 1)$ (q a prime) can be
constructed by considering the points and lines of a finite projective plane
with q + 1 points on each line.
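
The definition can be checked mechanically on the smallest case of the example above, q = 2, which gives S(2, 3, 7). The following is a minimal sketch, not part of the original paper; the function names and the particular labelling of the seven lines of the projective plane of order 2 are assumptions chosen for illustration.

```python
from itertools import combinations
from math import comb


def is_steiner_system(blocks, p, q, r):
    """Return True if `blocks` is a Steiner system S(p, q, r) on {1, ..., r}."""
    points = set(range(1, r + 1))
    # Every block must consist of exactly q of the r integers.
    if any(len(b) != q or not set(b) <= points for b in blocks):
        return False
    # Every set of p integers must lie in one and only one block.
    for subset in combinations(sorted(points), p):
        containing = sum(1 for b in blocks if set(subset) <= set(b))
        if containing != 1:
            return False
    return True


def expected_block_count(p, q, r):
    """Number of blocks of S(p, q, r): C(r, p) / C(q, p)."""
    return comb(r, p) // comb(q, p)


# One labelling (an assumption) of the lines of the projective plane of order 2.
FANO_LINES = [
    {1, 2, 3}, {1, 4, 5}, {1, 6, 7},
    {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6},
]

print(is_steiner_system(FANO_LINES, 2, 3, 7))
print(len(FANO_LINES), expected_block_count(2, 3, 7))
```

The brute-force check is only practical for small parameters; for S(4, 7, 23) the paper instead obtains the blocks from a linear subspace.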

The group G of a Steiner system $S(p, q, r)$ consists of all those permutations
of the symmetric group $S_r$ that permute the subsets $H_i$ among themselves.
Witt (1, p. 274) has shown that the Steiner systems S(4, 5, 11), S(5, 6, 12),
S(3, 6, 22), S(4, 7, 23) and S(5, 8, 24) are unique (i.e., for fixed p, q, r, there
exists a permutation of $S_r$ carrying $S_1(p, q, r)$ into $S_2(p, q, r)$) and the groups
associated with these Steiner systems are the Mathieu groups $M_{11}$, $M_{12}$,
$M_{22}$, $M_{23}$, and $M_{24}$ respectively.


3. Generation of S(4, 7, 23). Let $V(n)$ be the vector space consisting of
all n-tuples $(x_1, x_2, \ldots, x_n)$, with each $x_i$ contained in the Galois Field GF(2),
under the usual definitions of addition and scalar multiplication.

The distance $d(x, y)$ between two vectors $x = (x_1, x_2, \ldots, x_n)$ and
$y = (y_1, y_2, \ldots, y_n)$ of $V(n)$ is defined to be the number of coordinates for
which $x_i \neq y_i$ $(i = 1, 2, \ldots, n)$.

A subset $S(r, n)$ of $V(n)$ is defined to be an exact r-covering of $V(n)$ if and
only if

(i) For every vector $x \in V(n)$, $\min_{s \in S(r, n)} \{d(x, s)\} \le r$;

(ii) For distinct $s_1, s_2 \in S(r, n)$, $d(s_1, s_2) > 2r$.

For a fixed vector s of an exact r-covering $S(r, n)$, the number $N(r, n)$
of vectors $x \in V(n)$ satisfying $d(x, s) \le r$ is obviously

Received April 7, 1956 










(3.1) $N(r, n) = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{r}.$


The number of vectors in an exact r-covering $S(r, n)$ is clearly $2^n / N(r, n)$
since the equation $d(x, s) \le r$ can have but one solution s in $S(r, n)$ for every
vector x of $V(n)$.

It may be possible for $S(r, n)$ to be a linear subspace of $V(n)$. Certainly a
necessary condition for such a possibility is that $N(r, n)$ divide $2^n$. In the
case that n = 23 and r = 3 we have $N(3, 23) = 2^{11}$ and in the following lemma
an exact 3-covering of $V(23)$ is obtained that is a linear subspace.
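
The arithmetic behind (3.1) and the case n = 23, r = 3 can be confirmed directly. The sketch below is illustrative only: the helper names are assumptions, and the small test case (the two constant vectors of length 7, which form an exact 3-covering of V(7)) is not taken from the paper.

```python
from itertools import product
from math import comb


def ball_size(r, n):
    """N(r, n) of (3.1): binary n-tuples within distance r of a fixed vector."""
    return sum(comb(n, i) for i in range(r + 1))


def distance(x, y):
    """Number of coordinates in which the vectors x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def is_exact_covering(code, r, n):
    """Brute-force check of conditions (i) and (ii) over GF(2)."""
    code = list(code)
    # (ii) distinct vectors of the covering are more than 2r apart.
    for i, s1 in enumerate(code):
        for s2 in code[i + 1:]:
            if distance(s1, s2) <= 2 * r:
                return False
    # (i) every vector of V(n) is within distance r of some covering vector.
    return all(
        min(distance(x, s) for s in code) <= r
        for x in product((0, 1), repeat=n)
    )


print(ball_size(3, 23) == 2 ** 11)      # 1 + 23 + 253 + 1771 = 2048
print(2 ** 23 // ball_size(3, 23))      # size of an exact 3-covering of V(23)
print(is_exact_covering([(0,) * 7, (1,) * 7], 3, 7))
```

Brute force over all of V(23) would need about 8 million vectors against 4096 covering vectors, which is why the paper argues through column dependence instead.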


LEMMA 3.2. Let R be the subspace of V(23) generated by the rows of the 
following rectangular array with elements in GF (2): 


(3.3 L0OVV0GCVDVD0O000LLILOOLILOLOO 
QOLOVOVODVDVVOTLILILOLOLOLOLO 
ODOLOVDDODVODODVOOVOOLILLOOLOLLOOI 
QOVOD0LODQDDOVQODOOOLIOLIOOOLILIOI 
OVDD0DOLOVDVDDOOOLTIOLOLIOOOIL 
PN0000100000110011010110 
ODDDDDLOVDOOLOLLIOOLOOI!I 
OVVD0VDD0O0LOOOLOLLOLOOLIIO 
OVDV0D0D0D000TO00LTOLOLILIOOIOI 

OVDVDDVDDDDOOLTOLTOOLLIILILO0OO 

OD0DVDVODVOD00001TLOO00O00TLIIINI 


The linear subspace T orthogonal to R is an exact 3-covering of V (23). 


Proof. The crucial part of the proof is the verification that no six columns
of (3.3) are linearly dependent. The usual statement at this point regarding
straightforward computations would be inappropriate. However, the verification
was accomplished on the high speed computer SWAC, and the computations,
of necessity, will be omitted.

Assuming the result of the previous paragraph, the proof proceeds by noting
that, for any two distinct vectors $t_1$ and $t_2$ of T, $d(t_1, t_2) > 6$, since otherwise
there would exist a subset of six columns of (3.3) that would be linearly dependent.
Now a simple numerical calculation (i.e. $2^{11} \cdot 2^{12} = 2^{23}$) shows that T is an
exact 3-covering of $V(23)$.

We now proceed to generate S(4, 7, 23). Let $H_t$ be the set of all integers j
such that $t_j \neq 0$ for the vector $t = (t_1, t_2, \ldots, t_{23})$ of the linear subspace T
of Lemma 3.2.


THEOREM 3.4. The set of all sets $H_t$ containing seven integers forms a Steiner
system S(4, 7, 23).


Proof. T is an exact 3-covering of $V(23)$ and since no six columns of (3.3)
are linearly dependent it is clear that every vector of $V(23)$ with 4 non-zero
coordinates must be at distance 3 from exactly one of the vectors of T with 7
non-zero coordinates. Each vector of T with 7 non-zero coordinates is at distance 3
from $\binom{7}{4}$ vectors with 4 non-zero coordinates and hence there are
$\binom{23}{4} / \binom{7}{4} = 253$ vectors in T with 7 non-zero coordinates.

An arbitrary set of 4 integers cannot be contained in $H_{t_1}$ and $H_{t_2}$ (containing
7 integers) if $t_1 \neq t_2$ because this would imply that there would be 6 linearly
dependent columns of (3.3). This completes the proof of Theorem 3.4 and
establishes the existence of a Steiner system S(4, 7, 23).
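
The count in the proof of Theorem 3.4 is a one-line computation, and it agrees with the general block count of §2. A short check, with variable names chosen here for illustration:

```python
from math import comb

# Vectors of V(23) with exactly 4 non-zero coordinates.
weight4_vectors = comb(23, 4)

# Each weight-7 vector of T lies at distance 3 from C(7, 4) of them.
covered_per_block = comb(7, 4)

# Each weight-4 vector is covered exactly once, so the quotient is the
# number of weight-7 vectors of T, i.e. the number of blocks of S(4, 7, 23).
blocks = weight4_vectors // covered_per_block

print(weight4_vectors, covered_per_block, blocks)
```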


4. A matrix representation of $M_{23}$. In this section an 11 × 11 matrix
representation over GF(2) will be obtained for the Mathieu group $M_{23}$.

Let T be the exact 3-covering of $V(23)$ obtained in Lemma 3.2, and let
S(4, 7, 23) be the Steiner system consisting of the sets $H_t$ constructed in
Theorem 3.4. The following vectors of T are linearly independent and generate
T:

,; = ODODDDODOLILILTILLO000000000) 
2 =(11111100000010000000000) 
fs = (11100011100001000000000) 
tg = C1OOLIOLILOLOOVOVOLOODVOVOIODODDO) 
f, =~ (1LOLOLIOLILOODOVODOLODVOODVODO) 
fg = (0101TL1LO01110000001000000) 
fj = (111001000111000000100000) 
fs = (10100110011000000010000) 
tg = (01110001011000000001000) 
foo = 10010101101000000000100) 
fo = (001001111001000000000010) 


fe = OOLLILOILOIO!N 00000000000 1) 
These vectors yield subsets $H_{t_i}$ (i = 1, 2, ..., 12) of S(4, 7, 23) and since
$M_{23}$ consists of those permutations of the symmetric group $S_{23}$ that transpose
the subsets $H_t$ of S(4, 7, 23) among themselves, it is possible to consider $M_{23}$
as the group of all 23 × 23 permutation matrices Q which leave T invariant.
Thus a representation of $M_{23}$ is induced on the space T and, since each Q is
orthogonal, on the space R orthogonal to T.


THEOREM 4.1. The representation ρ of $M_{23}$ induced on R is an isomorphism.


Proof. The kernel of ρ consists of those permutations Q that leave the
array (3.3) invariant. Since no two columns of (3.3) are the same, the kernel
of ρ is the identity.

We thus obtain an 11 × 11 matrix representation over GF(2) for $M_{23}$
and it is of interest to note that this representation is irreducible. The referee
has suggested the following simple proof: If ρ were not irreducible, there would
exist an invariant subspace S of R of dimension k with 0 < k < 11. Then
$\rho(M_{23})$ would be a subgroup of the group of all non-singular transformations
of R leaving S invariant. However, the order of $\rho(M_{23})$ is divisible by 23 and
the order of the group of all non-singular transformations leaving S invariant,

$$2^{k(11-k)} \prod_{i=0}^{k-1} (2^k - 2^i) \prod_{j=0}^{10-k} (2^{11-k} - 2^j),$$

is not. This argument also proves that $M_{23}$ does not have a faithful
representation by k × k matrices over GF(2) for any k < 11.
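
The divisibility claim in the referee's argument can be verified numerically. The sketch below is not from the paper; the function names are assumptions, and the value 10200960 for the order of $M_{23}$ is a standard fact supplied here rather than quoted from the text. The stabilizer order is computed as $2^{k(11-k)}$ times the orders of the general linear groups of dimensions k and 11 − k over GF(2).

```python
def gl_order(k, q=2):
    """Order of the group of non-singular k x k matrices over GF(q)."""
    order = 1
    for i in range(k):
        order *= q ** k - q ** i
    return order


def stabilizer_order(k, n=11, q=2):
    """Order of the group of non-singular maps of an n-space over GF(q)
    that leave a fixed k-dimensional subspace invariant."""
    return q ** (k * (n - k)) * gl_order(k, q) * gl_order(n - k, q)


M23_ORDER = 10200960  # assumed from standard tables, not stated in the paper

print(M23_ORDER % 23 == 0)
print(all(stabilizer_order(k) % 23 != 0 for k in range(1, 11)))
print(all(gl_order(k) % 23 != 0 for k in range(1, 11)))
print(gl_order(11) % 23 == 0)
```

The underlying reason is that 2 has multiplicative order 11 modulo 23 (since $2^{11} - 1 = 23 \cdot 89$), so no factor $2^m - 1$ with m < 11 is divisible by 23.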


5. Comments and generalizations. If the field of coefficients in
§3 of $V(n)$ is allowed to be the Galois field $GF(p^k)$, the same definitions of
distance and exact r-covering $S(r, n)$ yield the fact that there are

$N(r, n, p^k) = \binom{n}{0} + (p^k - 1)\binom{n}{1} + \cdots + (p^k - 1)^r \binom{n}{r}$

vectors of $V(n)$ that satisfy $d(x, s) \le r$ for $s \in S(r, n)$.

In the case that p = 3, k = 1, r = 2 we find that $N(2, 11, 3) = 3^5$; as in
§3, it is possible to construct a 5 × 11 rectangular array over GF(3) such that
no 4 columns are linearly dependent. The subsequent analysis of the orthogonal
space in $V(11)$ over GF(3) leads to a Steiner system S(4, 5, 11).

It had been hoped that the matrix representation of the Mathieu group
obtained in this paper might lend itself to the determination of a simple set of
generators of $M_{23}$; unfortunately, this aim has not been realized.

It should also be pointed out that the divisibility of $(p^k)^n$ by $N(r, n, p^k)$,
although necessary, is not sufficient to ensure the existence of an exact
r-covering of $V(n)$. For example, $N(2, 90, 2) = 2^{12}$, yet a simple analysis of
vectors having three non-zero coordinates shows that no exact 2-covering
can exist. Here again the techniques of the present note become hopelessly
involved in combinatorial analysis if one attempts to find new Steiner systems
or simple groups.
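
Both numerical examples of this section follow from the formula for $N(r, n, p^k)$. The sketch below checks them; the function name is an assumption. The final lines illustrate one standard way to carry out the "simple analysis" for n = 90, which is not necessarily the author's: if an exact 2-covering contained the zero vector, each of the 88 weight-3 vectors whose support contains two fixed coordinates would lie in exactly one weight-5 covering vector, and each such covering vector would account for 3 of them, so 3 would have to divide 88.

```python
from math import comb


def ball_size_q(r, n, q):
    """N(r, n, q): n-tuples over GF(q) within distance r of a fixed vector."""
    return sum((q - 1) ** i * comb(n, i) for i in range(r + 1))


print(ball_size_q(2, 11, 3) == 3 ** 5)    # 1 + 22 + 220 = 243
print(ball_size_q(2, 90, 2) == 2 ** 12)   # 1 + 90 + 4005 = 4096

# Weight-3 vectors of V(90) whose support contains two fixed coordinates.
weight3_through_pair = 90 - 2
# A weight-5 vector through the pair covers C(3, 1) = 3 of them.
covered_each = comb(3, 1)
print(weight3_through_pair % covered_each)  # non-zero, so no such covering
```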


REFERENCE


1. E. Witt, Ueber Steinersche Systeme, Abhand. Math. Sem. Univ. Hamburg, 12 (1938),
265-275.


University of California at Los Angeles 







RELATIVE COHOMOLOGY 
D. G. HIGMAN 


It is our purpose in this paper to present certain aspects of a cohomology
theory of a ring R relative to a subring S, basing the theory on the notions
of induced and produced pairs of our earlier paper (2), but making the paper
self-contained except for references to a few specific results of (2). The
cohomology groups introduced occur in dual pairs. Generic cocycles are defined,
and the groups are related to the protractions and retractions of R-modules.
Our cohomology groups are modules over the center of R, and in the final
section we record some facts concerning their annihilators. Attention is given
to the case in which R is a self dual S-ring in the sense of (2). Applications of
the theory to the study of orders in algebras will be found in (4), where, in
particular, results of (3) are generalized.

Since this paper was first submitted, Professor Hochschild has kindly
given the author the opportunity of seeing the manuscript of his paper (8)
in which the methods of Cartan-Eilenberg (1) are generalized to give a theory
of relative homological algebra. The present paper (as well as (2)) has some
results in common with the book of Cartan-Eilenberg and overlaps to some
extent with the paper of Hochschild; we have indicated some of the relations
in footnotes. Our point of view and methods differ rather widely from
Hochschild's.

We are indebted to the referee for suggestions simplifying the notation 
and increasing the generality somewhat, and for the references to (1) 


1. Induced and produced pairs. Let R be a ring with identity element.
We shall use the terms right, left and two-sided R-module in the customary
way, but always assuming that the identity element of R acts as the identity
operator. We shall abbreviate "right R-module" to "R-module."

Let S be a ring with identity element, x a homomorphism of S into R
mapping the identity element of S into that of R. Then every R-module is
also an Sx-module and hence an S-module.

If M is an S-module, the product $M \otimes_S R$ becomes an R-module when one
defines $(u \otimes r)x = u \otimes rx$ for $u \in M$, and $r, x \in R$. The $\otimes$ notation is that
of (1), $M \otimes_S R$ denoting the tensor product over S of the S-module M with
the left S-module R. The pair consisting of this R-module and the natural
homomorphism $\kappa: M \to M \otimes_S R$, which is an S-homomorphism, we shall
refer to as the canonical (R, S, x)-produced pair determined by M. We shall
omit the (R, S, x) when no confusion can occur.

The term (R, S, x)-produced pair determined by M will be used to refer to


Received April 3, 1956; in revised form September 8, 1956 













those pairs consisting of an R-module $P(M)$ and an S-homomorphism
$\kappa_M: M \to P(M)$ which satisfy the condition

(P) for each R-module N there exists a homomorphism $\alpha \to \alpha^*$ of $\operatorname{Hom}_S(M, N)$
into $\operatorname{Hom}_R(P(M), N)$ such that for $\alpha \in \operatorname{Hom}_S(M, N)$, $\alpha^*$ is the unique element
of $\operatorname{Hom}_R(P(M), N)$ which makes the diagram


            P(M)
   κ_M   ↗        ↘   α*
       M ----------> N
              α





commutative. 

The Hom notation is that of (1), $\operatorname{Hom}_S(M, N)$, for example, denoting the
module of S-homomorphisms of M into N.

Taking * to be the natural homomorphism of $\operatorname{Hom}_S(M, N)$ into
$\operatorname{Hom}_R(M \otimes_S R, N)$, we find that the canonical produced pair satisfies (P). It
follows that any produced pair $(P(M), \kappa_M)$ determined by M is isomorphic
with the canonical one, i.e., that there exists an R-isomorphism $\varphi$ of $P(M)$
onto $M \otimes_S R$ such that the diagram


              φ
    P(M) ----------> M ⊗_S R
        ↖           ↗
     κ_M  \       /  κ
             M


is commutative (2). 

The canonical (R, S, x)-induced pair determined by M consists of the module
$\operatorname{Hom}_S(R, M)$, made into an R-module by setting $f^x(r) = f(xr)$ for $f \in \operatorname{Hom}_S(R, M)$
and $r, x \in R$, together with the natural homomorphism $\epsilon: \operatorname{Hom}_S(R, M) \to M$,
which is an S-homomorphism. An (R, S, x)-induced pair determined by


¹The considerations in (2) were conceived of primarily as generalizations to rings and algebras
of certain parts of the theory of representations of finite groups. The term induced module
was used in connection with $M \otimes_S R$ because of its relation to the classical construction for
the induced representation in that theory. This turns out to be unfortunate because produced
modules are then injective, induced modules projective, and because of the conflict with the
terminology of the representation theory of topological groups. We have therefore switched
notation and terminology in the present paper, interchanging "produced" with "induced"
and "P" with "I" throughout.




M consists of an R-module $I(M)$ and an S-homomorphism $\epsilon_M: I(M) \to M$,
satisfying the dual of condition (P), namely

(I) for each R-module N there exists a homomorphism $\beta \to \beta^+$ of $\operatorname{Hom}_S(N, M)$
into $\operatorname{Hom}_R(N, I(M))$ such that for $\beta \in \operatorname{Hom}_S(N, M)$, $\beta^+$ is the unique element
of $\operatorname{Hom}_R(N, I(M))$ making the diagram


            I(M)
   ε_M   ↙        ↖   β⁺
       M <---------- N
              β


commutative.


The canonical induced pair satisfies (I) with the natural homomorphism

$\operatorname{Hom}_S(N, M) \to \operatorname{Hom}_R(N, \operatorname{Hom}_S(R, M))$

as +. Every induced pair determined by M is isomorphic with the canonical
one (2).

Henceforth, $(P(M), \kappa_M)$ and $(I(M), \epsilon_M)$ will denote respectively an
(R, S, x)-produced pair and induced pair determined by M. A subscript
(R, S) or (R, S, x) will be used to obtain a more explicit notation when desired;
thus $I_{(R,S)}(M)$ for $I(M)$, and so on.

In case M is an S-module by virtue of being an R-module, there exist by
(P) and (I) unique R-homomorphisms

$t_M: P(M) \to M$, $j_M: M \to I(M)$, $\kappa_M t_M = j_M \epsilon_M = 1$.

Then $\kappa_M$ and $j_M$ are 1-1 while $t_M$ and $\epsilon_M$ are onto. For the canonical pairs,
$t_M$ is the natural homomorphism $M \otimes_S R \to M$, $j_M$ the natural homomorphism
$M \to \operatorname{Hom}_S(R, M)$. We shall denote by $K(M)$ the kernel of $t_M$,
$K(M) = P(M)(1 - t_M \kappa_M)$, and by $L(M)$ the cokernel of $j_M$, $L(M) = I(M)/I(M)\epsilon_M j_M$.
Thus $K(M)$ and $L(M)$ are R-modules determined up to R-isomorphisms
independently of the particular choice of induced and produced pairs. It will
be convenient to introduce the notation $\eta_M$ for the injection $K(M) \to P(M)$
and $\pi_M$ for the projection $I(M) \to L(M)$. Note that there exists a unique
S-homomorphism

$\lambda_M: L(M) \to I(M)$, $\lambda_M \pi_M = 1$, $\pi_M \lambda_M = 1 - \epsilon_M j_M$.


2. The $Z_R$-modules $H^i(M, N)$ and $\hat{H}^i(M, N)$. Given a ring X, $Z_X$ will
denote the center of X. If M is an S-module and N an R-module, the module
$\operatorname{Hom}_S(M, N)$ becomes a $Z_R$-module when we define $f^z(u) = f(u)z$ for
$f \in \operatorname{Hom}_S(M, N)$, $u \in M$ and $z \in Z_R$. If M is an R-module, $\operatorname{Hom}_R(M, N)$ is a
$Z_R$-submodule of $\operatorname{Hom}_S(M, N)$. We can verify that the homomorphisms













$*: \operatorname{Hom}_S(M, N) \to \operatorname{Hom}_R(P(M), N)$, $+: \operatorname{Hom}_S(N, M) \to \operatorname{Hom}_R(N, I(M))$


of conditions (P) and (I) are $Z_R$-homomorphisms.

Suppose now that M and N are R-modules. We obtain a $Z_R$-homomorphism
$\delta_{M,N}$ of $\operatorname{Hom}_S(M, N)$ into $\operatorname{Hom}_R(K(M), N)$ by following * with the
homomorphism of $\operatorname{Hom}_R(P(M), N)$ into $\operatorname{Hom}_R(K(M), N)$ induced by the injection
$\eta_M$ of $K(M)$ into $P(M)$. Thus, for $f \in \operatorname{Hom}_S(M, N)$,

$f^{\delta_{M,N}} = \eta_M f^*.$

The kernel of $\delta_{M,N}$ is $\operatorname{Hom}_R(M, N)$. In fact, if f is in this kernel,

$0 = (1 - t_M \kappa_M)\eta_M f^* = f^* - t_M f.$

Hence $t_M f = f^*$ is an R-homomorphism, and, since $t_M$ is an R-homomorphism
onto, f is an R-homomorphism. On the other hand, if $f \in \operatorname{Hom}_R(M, N)$,
$f^* = t_M f$ and

$f^{\delta_{M,N}} = \eta_M t_M f = 0.$


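Dually, we may define a $Z_R$-homomorphism $\hat{\delta}_{M,N}$ of $\operatorname{Hom}_S(N, M)$ into
$\operatorname{Hom}_R(N, L(M))$, namely, the product of + with the homomorphism of
$\operatorname{Hom}_R(N, I(M))$ into $\operatorname{Hom}_R(N, L(M))$ induced by $\pi_M$:

$g^{\hat{\delta}_{M,N}} = g^+ \pi_M$

for $g \in \operatorname{Hom}_S(N, M)$. The kernel of $\hat{\delta}_{M,N}$ is $\operatorname{Hom}_R(N, M)$.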
The R-module M determines R-modules $K^i(M)$ and $P^i(M)$, i = 0, 1, 2, ...,
defined by the recursive formulas

$K^0(M) = P^0(M) = M$, $K^{i+1}(M) = K(K^i(M))$, $P^{i+1}(M) = P(K^i(M))$.

Dually, M determines R-modules $L^i(M)$ and $I^i(M)$, i = 0, 1, 2, ..., according
to

$L^0(M) = I^0(M) = M$, $L^{i+1}(M) = L(L^i(M))$, $I^{i+1}(M) = I(L^i(M))$.


We shall now define a $Z_R$-complex $(C(M, N), \delta)$, determined by the ordered
pair M, N of R-modules, by letting

$C^i(M, N) = \operatorname{Hom}_S(K^i(M), N)$, $\delta^i = \delta_{K^i(M),N}$ for $i \ge 0$,

and

$C^i(M, N) = (0)$, $\delta^i = 0$ for $i < 0$.

We have $\delta^{i-1}\delta^i = 0$ for all i, since, for i > 0, the image $B^i(M, N)$ of $\delta^{i-1}$
is contained in $\operatorname{Hom}_R(K^i(M), N)$, which is the kernel of $\delta^i$. The cohomology
groups of this complex, which are $Z_R$-modules, and which are determined
up to $Z_R$-isomorphism independently of the particular choice of produced
pair determined by M, we shall denote by $H^i(M, N)$. More explicit notation

²This amounts to constructing the standard (R, S)-projective and injective resolutions of
M as in (8).




such as $H^i_{(R,S)}(M, N)$ will be used where desirable.³ It is immediate that
$H^0(M, N) = \operatorname{Hom}_R(M, N)$, and, for i > 0, a > 0,

$H^{i+a}(M, N) = H^i(K^a(M), N).$


We may dualize the above construction to obtain a second $Z_R$-complex
$(\hat{C}(M, N), \hat{\delta})$, taking

$\hat{C}^i(M, N) = \operatorname{Hom}_S(N, L^i(M))$, $\hat{\delta}^i = \hat{\delta}_{L^i(M),N}$, $i \ge 0$,

and $\hat{C}^i(M, N) = (0)$, $\hat{\delta}^i = 0$ for i < 0. We shall denote the image of $\hat{\delta}^{i-1}$
by $\hat{B}^i(M, N)$, and the cohomology groups of this complex by $\hat{H}^i(M, N)$.
These are again $Z_R$-modules, determined up to $Z_R$-isomorphism independently
of the choice of induced pair determined by M. We have

$\hat{H}^0(M, N) = \operatorname{Hom}_R(N, M)$, $\hat{H}^{i+a}(M, N) = \hat{H}^i(L^a(M), N)$

for all i > 0, a > 0.


3. The isomorphism ¢. Let \/ and N be S-modules. There is a Z”/\S 
isomorphism ¢ of Homs(P(M), NV) onto Hom s(.M, /(.N)) mapping 


fe Homs(P(M), N 


onto /* = «yf*. The inverse r of ¢ maps g © Homs(.\/, /(.V)) onto g’ g* én 
In fact, (xyft+)* = f+, and hence f*? = (xyft+)*ey = ftey = f, so that or = | 
Dually, ro = 1. In the case of the canonical-induced and -produced pairs, ¢ is 
the natural homomorphism of Hom s(./@ sR,N) onto Hom s(./,Hom 5(R,M) 

Now assume that M and N are R-modules so that K(.W/) and L(N) are 
defined. We then have a Zg/\S,-isomorphism ¢ of Hom s(K(M]), V) onto 
Hom s(M, L(N)), mapping / Hom .(AK(.W), NV) onto 


“ go 
(" = ne. — tyyky) fl Tn. 


Inverse to ¢ is 7 defined by 


3 


7 T 
g = na(gdAw) 
lor g Hom s(.\/, L(.N)). Indeed, 


G n + : 
f Aw = Kwl(l — taka) ff] 9wAw = wal Cl — trek SIO €nJn) 
+ ° a 
= Kul(L — tarea)f) — Ka (l — trekar)fin = (CL — taker )fl 
Hence 
aT 6 T nid 
/ =nuw(f Aw) = nal(l — tarkar) fl I, 
whence 67 = 1. Dually, 7¢ = 1 
It can easily be proved that H (R: 8) WV, N is isomorphic with Ex (R: Sx (M, N) as detined 


by Hochschild (8). When this is taken into consideration, the connection between the results 
of §§2-7 of the present paper and (8) will be seen 













We shall show that the diagram 


Hom<(M,N) #8 Hom, (K(MIN) 


5NM = 
~ 


Hom. (M,LIN)) 


is anti-commutative. If / Hom s(V, N), 





r - : 
(1 — tagka)f = (Ll — tagka)nal* = f* — tal. 
Hence 
bag . oO o Oo 
/ = i -_ tas} ) vy = ]* fv (Last) ust 
But 
0 ' 
[* ty = Ky(f*) tw = Kul *Intn = V, 
and 
G ‘ ; , 6 
(tagh) Ww = Kalla) tw = Kagtayl ty f"ay =f. 
Llence 
Fad he o 
Moreover, the diagram 
i/(M\ NI 
Hom (Kl 1), N) P 
”N,K(M) 
—_ 
om | 
y Ss 
l h . 1\\ = VM) ip \ 
Home(M,L(N)) > Hom,(K( LUND) 
a > 
M,L(N) 
is commutative. For, if / Hom s.(K(M), N), 
f = KKkim [dd _ lr uw)KeiM j " x, 
hence 
DO. . r 
(f )*=((1l— tk MAK (A Ty 
hus 
a6 ae , 
I = nxao(f )* = ncunl(l — teankxun)/| Os 
' , 6 
= nxeaol(l — teankcan)fl avy =f tw =f, 


proving the desired result. 











\pplying these results to the complexes (C(M, N),6) and (C(N, M), 
we obtain an anti-commutative diagram 
el 


»>C(M, N) > C°(M,N) > C'(M,.N) 5 CCM. N) 


lo go 


, 


a 0 ; 
3 O(N, M) @(N, M) > C’'(N, M) > @(N, M) >. 
Consequently 


THEOREM |. @ induces a Zpl\ S, isomorphism of H'(M, N) onto H'(N, M 
fori >O 


Corotitary. H'te(M,N)~H'(M,L*(N)) and H'+*(M,N)~ A‘(M, Ke(N 


ori > 0, a > O, the isomorphisms being Zp (\ S,-isomorphisms 


4. Some special induced and produced pairs. Jo give explicit con 
structions for the cohomology groups introduced above one need only supply 
particular induced and produced pairs, the canonical ones not always being 
the most suitable. Thus, for example, suppose that B and A are rings, and let 
T be a subring of A containing the identity element thereof. Let M be a 
B’ ® T-module. Here the ’ indicates mirror image, and @ the tensor product 
over the ring of the rational integers. A B’ @ T-module may be considered 
as a left B-, and a right 7-module, and we shall use corresponding notation 
where convenient. The module M @,7,A becomes a B’ @ A-module when 
we let b(u @ a) = bu Qa and (u @a)x = ax for b B, u M and a, 


‘ A, while Hom,(A, J) becomes a B’ @ A-module when we let °f(a 

b| f(a)| and f*(a) = f(xa) tor b B,f Hom,(A, M) and x, a A. As ma) 
be seen by verifying (1) and (P), ning these modules with the natural 
homomorphisms M— M @7A ana om 7(A, M) — M yields respectively 


i (B’ @ A, B’ @ T, x)-induced and produced pair determined by M, where 
x is the natural homomorphism B’ @ T — B’ @ A. If Misa B’ @ A-module, 
K(M) and L(M) are the respective kernels oi natural homomorphisms 
il @®,A—M and M —Hom,(A, M). 

Now let ‘MV be a B’ @ T-module, N a B’ @ A-module. Further, let LU be a 
subring of T(\Z,, containing the identity element of A. The module 
Hom pre 1 M, N) attains the status of a JT’ @, A-module when we define 


t 


f(u) = f(ut) and f* (uw) = f(u)x 


ior 


Hom py 9 (M,N), l rou 1, u VV; 


let us denote it by ®(M, N). If M is a B’ @ A-module we may replace 
above by y A, turning @(M, N) into an A’ @y A-module. We shall outline 


how one inay prove’ 


Combining Theorems | and 2 gives Theorem 2 of (8) 










THEOREM 2. Jf M and N are B' @ A-modules, then there exists a natural 
Z ,-tsomor phism of 


I] (M,N) onto Al (@(M, V),.1) 
) 


(B’@A, B’@T) 
where C = A’ @vyA,D=T' @®vA 


C, D 


The first of the cohomology groups mentioned is a ZB’ @.{-module, the 
second a Z~-module, so it makes sense to speak of a Z,-isomorphism between 
them. The natural homomorphism 7: D — C is understood. 

The main step in the proof consists in the construction of a suitable (C, D, n)- 
induced pair for 6(M, N). To this end, consider the diagram 


O(P(M) N) 


€ | \Ee 


Ww . 
O(M,N) <——-H 
, 
BS 
where (a) ¢ is the D-homomorphism induced by «xy: — P(M), (P(A), «x 
being a (B’ @ A, B’ @ T, x)-produced pair determined by M, and (b) 8 * 
is a D-homomorphism of the C-module H. If now 8+ is a C-homomorphism 


such that the diagram is commutative, then for h IT, u M, (hB*) (16K 5 
(hB8)(u), hence for a A, 
(hB™ )(uxy.a) = “(hB*) (uy |(ah)B}(1), 
1.€., 
1.1 (hB™ ) (UK y.a) |(ah)B| (1) 


On the other hand, we may verify that the formula 4.1 does indeed define a 
C-homomorphism making the diagram commutative. Consequently the pai 
(@(P(M), N), ©) isa (C, D, »)-induced pair for @(.M/, N 

If Misa B’ ® A-module, (VM, N) is a C-module, and the homomorphism 
ty: P(M) — M induces the unique A’ ®y A-homomorphism j: &(\/, N) 
#(P(M), N) such that je = 1, as is seen by taking 8 = | and H = (M,N ’ 
in 4.1. Consequently the sequence 


0) + (M,N) 5 o(P(M), VY) 4 @(K(M), VN) > ©) 


is exact, where y is induced by the injection K(.W) — P(M). In the complex 
for constructing the groups fl‘(@(M,N), A) from the produced pair 
(@(/(.4M), N), €) we may therefore identify L(@(./, N)) with @(K(M), N) and 
the projection of @(7(M), N) onto L( (MM, N)) with 


The modules Hom pg 7‘ M,N) and Homp(A, ®(M, N)) are in particular 


Z ,-modules, the first being a Zpro4- and the second a Ze-module. The 
B'@A 













natural isomorphism between them is a Z,-isomorphism. One may now 
verily the commutativity of the diagram 


Hom pg r M,N)—- Hom 5 pK (M A 
> > ob ‘ ) \ 
Hom (1, ®(M, V)) Hom) (-1, b(K (A), 
where (1) the arrows pointing down represent the natural isomorphisms, 


(2) the top arrow represents 6y.y, and (3) the bottom arrow represents th« 
product of 


+: Hom »)(.1, (iM, V)) — Hom | 1, (P(M), N)) 


with the homomorphism of the second of these modules into 


Hom p(A , &(K(M), N 


induced by y. Application of this fact to the appropriate complexes gives 


Theorem 2 


5. Generic cocycles. Let 1/, N and X be K-modules, and let 
f Hom s(K (AM), N). 


Then ps: g—ef for g Hom s(X, A (M/)) defines a Zg (0) S,-homomorphism 
u, of Homs(X, K(M)) into Hom.(X, N). If 


Homp(K (4M), V 


we see that wy is a Zg-homomorphism, and moreover, that the diagram 


Hom.(X,K(M)) °*5#° Homg(K(X), K(M)) 


My My 


; by A 
Hom s(X, .\) ; Hom s(A(X ), \ 
is commutative. For then 
uo Ou 
g = nx(g/)” = nx g 


\pplication of this to the complexes (C(X, A(M)),6) and (C(X, N),4 
proves that uw, induces a Zpg-homomorphism of H*(X, A(A\/)) into H*(X, N 
lor all a > O. In particular 

If f Hom 5(K‘(M), N), uw, induces a Zpg-homomorphism of H*(M, A'(M 
nto H*(M, N) foralla > 0,and p,: I‘ — f, where I' is the identity automorphism 
of K'(M 

We shall refer to the element / B‘(M, K'(4/)) as the first generic i-cocye 
determined by M 

Dually, if 

Homg,(N, L(M)), Ay: g—ofg 


defines a Zg0\ S,-homomorphism \, of Homs(L(.M), X) into Homs(N, A 













which is a Zpg-homomorphism if / Homeg(N, L(AM)). In the latter case, the 
diagram 


Hom (L(M), X) “*£° Hom «(L(M), L(X)) 
Ay A, 


. 


Hom s(NV, X ) 


Wx.N Hom «(.V, L(X) 
is commutative. Hence A; induces a Zg-homomorphism of H*(X, L(.M)) into 
H«(X, N), a > 0. In particular 

Iff € Hom(N, L*(M)), \, induces a Z p-homomor phism of H«=(N, L‘(M)) into 
H=(M, N) for all a > 0, and d,: J'— f, where J‘ is the identity automorphism 
of L'(M). 

We shall refer to the element J B‘(M, K‘(M)) as the second generic 
i-cocycle determined by M 


As a consequence of the above considerations we obtain 


THEOREM 3. If i is a positive integer, and M is an R-module, then the following
conditions imply each other.

(a) $H^i(M, N) = (0)$ for all R-modules N.

(b) $H^i(M, K^i(M)) = (0)$.

(c) $I^i \in B^i(M, K^i(M))$.

(d) $H^{i+a}(M, N) = (0)$ for all R-modules N and all $a \ge 0$.


The dual Theorem 3′ is obtained by replacing H with $\hat{H}$, K with L, B with
$\hat{B}$, and I with J.


6. Protractions and retractions.⁵ Let M be an R-module. If H is an
R-module and $\varphi: H \to M$ is an R-homomorphism, we shall call the pair
$(H, \varphi)$ an (R, S)-protraction of M provided that there exists an
S-homomorphism $\lambda: M \to H$ such that $\lambda\varphi = 1$. The kernel $N = H(1 - \varphi\lambda)$ of $\varphi$
will be called the kernel of $(H, \varphi)$. Two (R, S)-protractions $(H_1, \varphi_1)$ and
$(H_2, \varphi_2)$ of M with kernel N will be called R-isomorphic if there exists an
R-isomorphism $\mu$ of $H_1$ onto $H_2$ such that $\varphi_1 = \mu\varphi_2$.

Corresponding to an (R, S)-protraction of M with kernel N there ts ai 
element f, Home(K(M), N) defined by 


fy, = gydA*(1 or 


It can be seen that the correspondence (//, ¢) — /, induces a |—-1 mapping 
of the set of classes of R-isomorphic (2, S)-protractions of M with kernel 
N onto I’(M, N), which becomes a group isomorphism when the Baer 
composition is introduced into the set of classes of protractions. 

The (XR, S)-protraction (H, ¢) of M with kernel N is said to split if there 
exists an R-homomorphism a: M — H such that a@ = |. Two R-isomorphi« 
protractions split, or do not, together. Let f, © Hom,g(K(M), N) correspond 


‘The material of this section overlaps considerably with Cartan and Eilenberg (1, §6, Ch. IT) 


as well as with Hochschild (8) 















to $(H, \varphi)$ as above. Then it is easy to see that each of the following conditions
is necessary and sufficient for $(H, \varphi)$ to split:

(a) there exists an R-submodule N* of H such that $H = N \oplus N^*$,

(b) $H \cong N \oplus M$ as an R-module,

(c) $f_\varphi \in B^1(M, N)$.

Che pair (J(M), ty) is an (R, S)-protraction of M with kernel A (4), an 
corresponds to the class of the first generic 1-cocvcle 


[i =f 


Kw 
\n R-module M will be called (R, S)-projective if, whenever (//’, @) is an 
R, S)-protraction of an R-module H and a: M — H is an R-homomorphism, 
there exists an R-homomorphism &@: M — H such that a = &@. From (2, 
lheorem 6) Theorem 3, and the above remarks we conclude 


THEOREM 4. Each of the following conditions is necessary and sufficient for
an R-module M to be (R, S)-projective.

(a) The (R, S)-protraction $(P(M), t_M)$ splits.

(b) Every (R, S)-protraction of M splits.

(c) $H^1(M, N) = (0)$ for all R-modules N.

If U is an S-module, the R-module $P(U)$ is (R, S)-projective according to
(2, Theorem 3). Hence we have

COROLLARY. If U is an S-module, $H^i(P(U), N) = (0)$ for all R-modules N
and all i > 0.


A pair (H,y) consisting of an R-module H and an R-homomorphism 
¥: M — H will be called an (XR, S)-retraction of M with kernel N if there 
exists an S-homomorphism yu: 7 — M such that pu = 1, and if N is the 
cokernel of y, N = H/ My. This is the dual of the concept of (2, S)-protraction 
We define R-isomorphism between retractions by dualizing the corresponding 
concept for protractions, and obtain a 1-1 correspondence between the set 
of classes of isomorphic (2, S)-retractions of M with cokernel N and the 
elements of /7'(.M, N) (and hence of I/'(N, M) by Theorem 1). The definition 
of splitting for (2, S)-retractions is dual to that for protractions. Of course 
there is a 1-1 correspondence between the set of (X, S)-protractions of M/ 
with kernel N and the (2, S)-retractions of N with cokernel N, such that a 
protraction splits if and only if the corresponding retraction splits. 

(P(M), jm) is an (R, S)-retraction of M with cokernel L(M), and corre- 
sponds to the class of the second generic 1-cocycle J' 

Dual to (2, S)-projective modules we have (RX, S)-injective modules, and 
dual to Theorem 4 we have 


THEOREM 4′. The following conditions
(a) The (R, S)-retraction $(I(M), j_M)$ of M splits.
(b) Every (R, S)-retraction of M splits.
(c) $\hat{H}^1(M, N) = (0)$ for all R-modules N.
are each necessary and sufficient for an R-module M to be (R, S)-injective.













If U is an S-module, $I(U)$ is (R, S)-injective according to (2, Theorem 3′).
Consequently


COROLLARY. For every S-module U and every R-module N, $\hat{H}^i(I(U), N) = (0)$
for all i > 0.


7. Cohomology dimension. If an R-module M is (R, S)-projective
[injective], then according to (2, Theorem 6) so is $K(M)$ [$L(M)$]. Under
certain circumstances the converse is true. We shall consider the hypothesis

(R, S; M) There exists an R-isomorphism $\mu_M$ of $I(M)$ onto $P(M)$.

If (R, S; M) holds, then M is (R, S)-projective if and only if it is (R, S)-
injective, as follows from (2, Theorems 6, 6′).


THEOREM 5. Suppose that the hypothesis (R, S; K(M)) holds. Then K(M)
(R, S)-projective implies that M is (R, S)-projective.


Proof. If (R, S; K(M)) holds and $K(M)$ is (R, S)-projective then $K(M)$
is (R, S)-injective. Hence the (R, S)-retraction $(P(M), \eta_M)$ of $K(M)$ splits
by (2, Theorem 6′). Hence $P(M) \cong M \oplus K(M)$ as an R-module, so that M
is (R, S)-projective by (2, Theorem 6).

The dual Theorem 5′ is obtained by replacing K with L and projective
with injective.

It will be convenient to denote by d:g.s) M the smallest integer 1 > 0 such 
that Z7'(M, N) = (0) for all R-modules N, if such an i exists, setting dip 5 
M = @ otherwise. The dual diz.s) M is defined by replacing H by A. By 
Theorems 4, 4’, diz.s) M < 1 if and only if K‘~'(M) is (R, S)-projective, while 
dig.s) M <i if and only if L*' (M) is (R, S)-injective. By Theorem 5, if 
(R, S; K*(M)) holds, then dips, M <i+ 1 implies dizg.s, M <1, while if 
(R, S; L‘(M)) holds, then diz x) M < i + 1 implies diz.s) M < i. 

The two conditions 

(c.1) dips) M <i for all R-modules M 

(c’.i) din.s) M < i for all R-modules M. 
imply each other as we deduce at once from Theorem |. We define class (R, S 
to be the minimum integer 7 > 0 such that (c.2) and (c’.1) hold if such 
an 7 exists, letting class (R,S) = @ otherwise. From the above we have 


THEOREM 6. If the hypothesis (R,S; M) holds for every R-module M, then
class (R,S) < $\infty$ implies class (R,S) = 1.


The hypotheses of this theorem are satisfied if R is a self dual S-ring in the 
sense of (2). 
Now let us suppose that B, A, T and U are rings as in §4, and let M and N
be $B' \otimes A$-modules. If


((B@A. B’@T) 







RELATIVE COHOMOLOGY


then /'(.M)~ A‘'(M) @ K'*'(AM) asa B’ ®@ A-module, according to Theorem 
6 of (2). Consequently 


&(1'(M), N)=~ ®(K'(M), N) @O(K*'(M), N 
as an A’ @y A-module, whence one concludes by the construction of §4 that 


qd M,N) € 1. 


A’@vA, T’@vA )' 
In particular, if M is (B’ @ A, B’ @ T)-projective then @(M, N) is (A’ @yA, 
T’ ® » A)-injective (8, Lemma 2). Moreover, we conclude that 


class(.4’@y.1, T’@uvA Jad d | 


(A'@vA, T’'@vA) ~ “(4'@uA. T’@vAY 
> class(B’ @ A, B’ @7), 


considering A in the natural way as an A’ @y A-module. 


8. The ideals 3‘(1/) and 3‘(1/). The results of §5 can be refined as 
follows. We shall denote by Q3'(4/, N) for 32.5) (M, N)] the annihilator 


of the Zpg-module H/‘(M, N). Letting 3‘(M/) denote the intersection over all 
R-modules N of the ideals 3‘(.\7, N) we have by §5 that 


XCM) = 3'(M, K'(M)) 


= fw € Zeltw € B'(M, K‘(M))}. 
Here fw denotes right operation by w; fw: u— uw for u € K*'(M), w © Zp, 
so that fw = J. Condition (a) of Theorem 3 is equivalent to the condition 
that 3'(M) = Zp. 

Dually, we define ¥'(M, N) to be the annihilator of the Z,-module 
fl'(M, N), and ¥‘(M) to be the intersection over all R-modules N of these 
ideals. Then 

¥4(M) = ¥(M, L*(M)) 
fw € Zeltw € BY(M, L‘(M))}. 


The condition dual to condition (a) of Theorem 3 is equivalent to the condition 
that ¥4(M) = Ze. 
We have at once that for 7 > 0, a > 0, 
¥7*(M,N) = 3°(K*(M), N), 


Moreover, bv Theorem 1, 


(Mu, oe ¥(L"(M), V). 


VMN) S = BN, M)N S, 
while the Corollary to Theorem | implies that 
(MNS, SI (M), BXUNNS, CIM 
lor: > O, a > 0. 










Let (/7,¢) be an (X&, S)-protraction of the R-module M with kernel N 
($6). Then there exists an S-homomorphism \: M— H such that A¢ = 1, 
and N = H(1 — $d). The corresponding element fy Hom ,(M, N) is defined 
by fx = nyA*(1 — Ad). 


PROPOSITION. Jf w © Zp, then fy B'(M, N) if and only if there exists 
an R-homomorphism 8: M—H such that Bd = {w, where (w denotes right 
operation on M by w. 


Proof Suppose that 


w Own 
Ul 


fA = g Hom s(M, \ 


This means that 97,,A*(1 — @A)fw = nyg*. Then, if » is the injection N — /], 


O = nylA*(1 — OA)oo — naug*ln = nulA*iwo — g*n] 


, ; 6 
= nulAfw — gn|* = [Ato — gn] oan 


Consequently 8B = Atw — gyn is an element of Homg(M, H). Further, 8¢ = 
AC{wh — end = fw. 
Suppose on the other hand that there exists an R-homomorphism 6: M — I] 
such that Bd = fw. Since 
nuB* = nubly = 0, 


if we let y = Atw — B, we have 


 w ; 3 
(fx )n = nu (A*to — B*) = nywy* = gn, 


where g = y(fw — dy) is an element of Hom;(M, N). Hence 


) 6 
 - = Mu ait 

There is the dual result for (R, S)-retractions of M.

Application of this proposition to the (R, S)-protraction (P(M), t_M) of M
gives (a) of the following theorem, (a') being its dual.


THEOREM 7. If w € Zp, then 
(a) w ¥'(.M) if and only if there exists B Homep(\J, P(M)) such that 
Bt = Cw. 


(a’) w ¥'(M) if and only if there exists B Homp(l(M), M) such that 


iw = tw. 





I(M) —“— p@m 






I] 




Assuming hypothesis (R, S; M) of §7, namely, the existence of an R-isomorphism
$\mu_M$ of I(M) onto P(M), we may construct the Casimir operators
as in (2). Thus, if a € Homs(M, M), cla) = atyyty and é(a) = jypya* 
are elements of Homg(M, M). We may call c and é the first and second 
Casimir operators associated with M. They are of course dependent on yp 4, 


THEOREM 8. If (R,S; M) holds, then an element w © Zp is contained in 
¥’(M) if and only if there exists an S-endomorphism a of M such that cla) = tw 


Proof. Suppose a is an S-endomorphism of M/ such that c(a tw. Then 
8 = atuy is an R-homomorphism of M into /(.M), and 


Bly = atuy ty = cla) = tw 
Hence w ¥'(M) by Theorem 7. 
On the other hand, w € 93’(M) implies by Theorem 7 the existence of an 
R-homomorphism 8: M—IJ(M) such that Bly = tw. Now a = Buy'ey is 


an S-endomorphism of M such that 


' 1 ' . . 
Cla) = a uyly = (Bum €y) warty = Bt = fw 


lhe dual of Theorem 8 is obtained by replacing 3 by ¥ and c by é 

The condition (R,S; 1) may be strengthened by demanding that « i 
Let us denote the resulting condition by (2, S; 1/)*. If R is a self-dual S-ring 
in the sense of (2), (R,.S; M)* holds for all R-modules M/ (2). Comparing 


Theorem 8 and its dual we have 
CorotLary 1. Jf (R,S; M)+ holds then X*(M) = ¥'(M 
We can also prove 


CoROLLARY 2. Jf (R,S;K(M))* holds then 3°7(M)OS, = #(ANOS,, 
while dually, if (R, S; L(M))* holds then ¥*(M) CO S, = YUNA S, 


Proof. By Corollary 1, 32(M) = 3'(K(M)) = 3'(K(4M)). By Theorem 
| there exists a Zp) S,-isomorphism of '(K(M), M) onto H'(M. K(M 
It follows that &(K(M)) OS, © 3°(M, K(M)) = 2'(¥ Since 


+'(M) OVS, © 3?(M), the corollary is proved. 


The above results may be extended by using the recursion relations 
¥*(M/) = ¥'(Ke(M)) and Yytte( M) = ¥(L2(M)). Thus, for example, we 
have by Coroilary 2 that if (R,.S; 11)+ holds for all R-modules M/. then fo 


i> 0, 
MONS, = MOS, 
and 


¥'(M) 0 S, = 3°(M)N S.. 


The methods of §4 may be used to give further information concerning these 







ideals in the case considered there. For example, if M is an A’ @y A-module 
we find that 


3 (A)N Dre ¥ (M) j 


(C, D) 


where the notation is that of §4 


C,D 


To see how ideals of the kind considered here occur in the study of orders 


in algebras, see (3) and (4). 
REFERENCES
1. H. Cartan and S. Eilenberg, Homological algebra (Princeton, 1956).
2. D. G. Higman, Induced and produced modules, Can. J. Math., 7 (1955), 490-508.
3. D. G. Higman, On orders in separable algebras, Can. J. Math., 7 (1955), 509-515.
4. D. G. Higman, On representations of orders (in preparation).
5. G. Hochschild, On the cohomology theory for associative algebras, Ann. Math., 46 (1945), 58-67.
6. G. Hochschild, On the cohomology groups of an associative algebra, Ann. Math., 47 (1946), 568-579.
7. G. Hochschild, Cohomology and representations of algebras, Duke Math. J., 14 (1947), 921-948.
8. G. Hochschild, Relative homological algebra, Amer. Math. Soc. Trans., 82 (1956), 246-269.

Montana State University and
University of Michigan





SOME REMARKS ON NOETHERIAN RINGS 
MICHIO YOSHIDA 


In his lecture at the University of Kyoto on September 23, 1955, Professor 
Artin gave an important theorem on Noetherian rings, which seems to have 
not a few interesting consequences. It is the purpose of our present note to 
point out one of them. We begin by quoting a special case of the theorem. 


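THEOREM. Let R be a Noetherian ring with unit element, and $\mathfrak{a}$, $\mathfrak{b}$ ideals of R.
Then there exists a positive integer d such that

$$\mathfrak{a}^n \cap \mathfrak{b} = \mathfrak{a}^{n-r}(\mathfrak{a}^r \cap \mathfrak{b}), \qquad n \ge r \ge d.$$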

Proof. Let $\{a_1, \ldots, a_m\}$ be a system of generators of $\mathfrak{a}$, and consider the
polynomial ring $R[x] = R[x_1, \ldots, x_m]$. Denote by $A_r$ the set of forms of
degree r in R[x], and by $B_r$ the set of all the forms f(x) of degree r such that
$f(a_1, \ldots, a_m) \in \mathfrak{b}$. $A_r$ is an R-module, $B_r$ is a submodule of $A_r$, and obviously
$A_{n-r} \cdot B_r \subseteq B_n$ for $n \ge r$. We select a finite system of forms $f_i(x)$, $1 \le i \le l$,
from $\{B_r;\ r = 0, 1, 2, \ldots\}$ such that any form f(x) of $\{B_r;\ r = 0, 1, 2, \ldots\}$
may be represented as

$$f = \sum_{i} \varphi_i f_i,$$

where the $\varphi_i$ are forms of R[x]. Denote by d the maximum of the degrees of
$f_i(x)$, $1 \le i \le l$; then for $n \ge r \ge d$, $A_{n-r} \cdot B_r = B_n$, namely
$\mathfrak{a}^n \cap \mathfrak{b} = \mathfrak{a}^{n-r}(\mathfrak{a}^r \cap \mathfrak{b})$.
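The statement can be checked by hand in the simplest Noetherian ring. The following is a minimal sketch, assuming R is the ring of integers with principal ideals (a) and (b), so that an intersection of ideals is generated by a least common multiple; the function names and the search bound are illustrative only and are not part of the original note.

```python
from math import gcd


def lcm(x, y):
    """Generator of the intersection (x) ∩ (y) in the integers."""
    return x * y // gcd(x, y)


def artin_rees_holds(a, b, n, r):
    """Check (a)^n ∩ (b) == (a)^(n-r) * ((a)^r ∩ (b)) for ideals of Z."""
    return lcm(a ** n, b) == a ** (n - r) * lcm(a ** r, b)


def smallest_d(a, b, bound):
    """Least d such that the identity holds for all d <= r <= n <= bound."""
    for d in range(1, bound + 1):
        if all(artin_rees_holds(a, b, n, r)
               for r in range(d, bound + 1)
               for n in range(r, bound + 1)):
            return d
    return None
```

For a = 6 and b = 8 the identity fails for r = 1 and r = 2 but holds from r = 3 onwards, because 8 divides 6^r exactly when r is at least 3. The search is only a finite check up to the chosen bound, not a proof.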


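By taking a principal ideal for $\mathfrak{b}$, we obtain the following

COROLLARY. Let $\mathfrak{a}$ be an ideal of R, and a a nonzero-divisor of R. Then there
exists a positive integer d such that

$$\mathfrak{a}^n : Ra = \mathfrak{a}^{n-r}(\mathfrak{a}^r : Ra), \qquad n \ge r \ge d,$$

consequently

$$\mathfrak{a}^n : Ra \subseteq \mathfrak{a}^{n-d}, \qquad n \ge d.$$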


Though Professor Artin did not mention this corollary, the last formula
$\mathfrak{a}^n : Ra \subseteq \mathfrak{a}^{n-d}$ is of some interest. This is really a satisfactory generalization
of a well-known theorem (1, p. 699, Lemma 9; 5, p. 38, Lemma 1). We would
refer readers to a remark by Samuel on this kind of formula (2, p. 34). This
formula enables us to sharpen one of his results (2, p. 23) as follows.


Received October 14, 1955 













THEOREM 1. Let $\mathfrak{a}$ be an ideal of a Noetherian ring R. If $\mathfrak{a}$ contains at least
one nonzero-divisor, then there exists an element a of $\mathfrak{a}$ such that

$$\mathfrak{a}^{n+r} : Ra = \mathfrak{a}^n$$

for sufficiently large n, where r is determined by $a \in \mathfrak{a}^r$ and $a \notin \mathfrak{a}^{r+1}$.


Proof. Put

$$\mathfrak{n} = \bigcap_{n=1}^{\infty} \mathfrak{a}^n, \qquad {}^*R = R/\mathfrak{n}, \qquad {}^*\mathfrak{a} = \mathfrak{a}/\mathfrak{n}.$$

It is easily seen, e.g. by the intersection theorem (4, p. 180, Theorem 3), that
${}^*\mathfrak{a}$ contains at least one nonzero-divisor and that any prime ideal of the zero
ideal of ${}^*R$ is closed and not open in the ${}^*\mathfrak{a}$-adic topology. So Samuel's observations
on the ring of forms $F({}^*\mathfrak{a}) = \sum {}^*\mathfrak{a}^i / {}^*\mathfrak{a}^{i+1}$ (2, p. 22-23) ensure the existence of
a superficial element ${}^*a$ of some degree r with respect to ${}^*\mathfrak{a}$, which is not a
zero-divisor. Hence ${}^*\mathfrak{a}^{n+r} : {}^*R\,{}^*a = {}^*\mathfrak{a}^n$ for sufficiently large n. Any element
in the residue class ${}^*a$ will have the property required in the theorem.


COROLLARY. Under the same assumption on $\mathfrak{a}$, there exist positive integers
r, $n_0$ such that

$$\mathfrak{a}^{n+r} : \mathfrak{a}^r = \mathfrak{a}^n, \qquad n > n_0.$$


We do not know whether we can always take 1 for r in this corollary, but 
Samuel (3, p. 177, Theorem 10) tells us the following: 


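THEOREM. Let A be a local ring with the maximal ideal $\mathfrak{m}$, and let $\mathfrak{q}$ be an
$\mathfrak{m}$-primary ideal. Suppose $\mathfrak{m}$ contains at least one nonzero-divisor, then

$$\mathfrak{q}^n : \mathfrak{q} = \mathfrak{q}^{n-1} \quad \text{for sufficiently large } n.$$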

Proof. In the case that the residue field $k = A/\mathfrak{m}$ is infinite, his assertion
is substantiated by the existence of a superficial element of degree 1 with
respect to $\mathfrak{q}$, which is not a zero-divisor (2, p. 23). The other case, that k is
finite, shall be reduced to the former case by the following device. Form the
polynomial ring A[x] in an indeterminate x, then form the ring of quotients
A* of $\mathfrak{m}A[x]$ with respect to A[x]. The residue field of A* is k(x), hence

$$\mathfrak{q}^n A^* : \mathfrak{q}A^* = \mathfrak{q}^{n-1}A^*.$$

Notice that

$$(\mathfrak{q}^n A^* : \mathfrak{q}A^*) \cap A = \mathfrak{q}^n : \mathfrak{q}, \qquad \mathfrak{q}^{n-1}A^* \cap A = \mathfrak{q}^{n-1}.$$

Before we transform the above theorems by "globalization," we shall
recall some definitions and well-known facts. Let $\mathfrak{p}$ be a prime ideal of R,
and $\mathfrak{q}$ a $\mathfrak{p}$-primary ideal. The $\mathfrak{p}$-primary component of $\mathfrak{q}^n$ is called the nth symbolic
power of $\mathfrak{q}$, and usually denoted by $\mathfrak{q}^{(n)}$. Let $\mathfrak{a}$ be an ideal of R, and $\mathfrak{p}_1, \ldots, \mathfrak{p}_l$
be the minimal prime ideals of $\mathfrak{a}$. The intersection of the $\mathfrak{p}_i$-primary components
($1 \le i \le l$) of $\mathfrak{a}^n$ is called the nth symbolic power of $\mathfrak{a}$, and denoted by
$\mathfrak{a}^{(n)}$. If $\mathfrak{q}_i$ denotes the $\mathfrak{p}_i$-primary component of $\mathfrak{a}$, then as is well known

$$\mathfrak{a}^{(n)} = \mathfrak{q}_1^{(n)} \cap \ldots \cap \mathfrak{q}_l^{(n)}.$$

We denote by S the complement of $\bigcup_{i=1}^{l} \mathfrak{p}_i$
in R, and form the ring of quotients $R_S$ of S with respect to R in the Chevalley-
Uzkov sense. We have then $\mathfrak{a}^{(n)} = \mathfrak{a}^n R_S \cap R$. Let


$$(0) = \mathfrak{q}_1^* \cap \ldots \cap \mathfrak{q}_t^*$$

be a primary decomposition of the zero ideal of R, and let $\mathfrak{p}_i^*$ be the prime
ideal of $\mathfrak{q}_i^*$. Assume $\mathfrak{p}_i^* \cap S = \emptyset$ for $i = 1, \ldots, s$ and $\mathfrak{p}_i^* \cap S \ne \emptyset$ for
$i = s+1, \ldots, t$. Then $\mathfrak{n} = \mathfrak{q}_1^* \cap \ldots \cap \mathfrak{q}_s^*$ is the kernel of the canonical
homomorphism of R into $R_S$. Contracting of ideals of $R_S$ on R and extending
of ideals of R to $R_S$ both give one-to-one mappings between the set of all
ideals of $R_S$ and the set of ideals of R whose prime ideals are disjoint with S.
These mappings are the inverse of each other and they are isomorphisms with
respect to the ideal operations ($\cap$) and (:). We are now in a position to
verify the following:


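THEOREM 2. Let $\mathfrak{a}$ be an ideal of a Noetherian ring R. Suppose that any
minimal prime ideal of $\mathfrak{a}$ is not a prime ideal of (0). Then there exist an element
a of $\mathfrak{a}$ and a positive integer $n_0$ such that

$$\mathfrak{a}^{(n+r)} : Ra = \mathfrak{a}^{(n)}, \qquad n \ge n_0,$$

where r satisfies $a \in \mathfrak{a}^{(r)}$ and $a \notin \mathfrak{a}^{(r+1)}$. Moreover

$$\mathfrak{a}^{(n+m)} : \mathfrak{a}^{(m)} = \mathfrak{a}^{(n)}$$

for sufficiently large n and arbitrary m > 0.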


REFERENCES


1. C. Chevalley, On the theory of local rings, Ann. Math., 44 (1943), 690-708.
2. P. Samuel, Algèbre locale, Mém. Sci. Math., No. 123 (Paris, 1953).
3. P. Samuel, Sur la notion de multiplicité en algèbre et en géométrie algébrique, J. Math. pur. et appl., 30 (1950), 159-205.
4. O. Zariski, Generalized semi-local rings, Sum. Bras. Math., 1 (1946), 169-195.
5. O. Zariski, Theory and applications of holomorphic functions on algebraic varieties over arbitrary fields, Mem. Amer. Math. Soc., No. 5 (New York, 1951).


Hiroshima University











SPACES OF DIMENSION ZERO 
BERNHARD BANASCHEWSKI 


1. Introduction. In a recent paper (1) it was remarked that the theory 
of zero-dimensional spaces is exactly that part of general topology which can 
be described in terms of equivalence relations. Here, it will be shown how this 
idea can be used to obtain the following characterizations of certain types of
zero-dimensional spaces: 


Any compact zero-dimensional space which has a denumerable basis for its
open sets and is dense in itself is homeomorphic to the space of 2-adic integers.
Any locally compact zero-dimensional space which is non-compact, has a
denumerable basis for its open sets and is dense in itself, is homeomorphic to the
space of 2-adic numbers.


The first of these statements is a well-known theorem (6; vol. 2, §40, 11), 
whilst the second one, here occurring as a simple consequence of the former, 
was proved in (5). However, in both cases, the methods employed are rather 
different from ours which are, in fact, no more than a refinement of arguments 
used in (1). In similar ways, the following assertions concerning non- 
archimedean metric spaces will be proved: 


The number of inequivalent non-archimedean metrics on a non-compact
zero-dimensional space which has a denumerable basis for its open sets and is
dense in itself, is at least equal to the power of the continuum.

Any separable non-archimedean metric space can be mapped by a metric
equivalence into the space of all formal power series with integral coefficients,
taken with its so-called topology of formal convergence.

Any two n-adic metric spaces are metrically equivalent to each other. 


2. Preliminaries. Topological terms, unless otherwise stated, will be used
in the sense of (3). The term "space" will always be taken to mean "Hausdorff
space which has a denumerable basis for its open sets." Zero-dimensionality
means the existence of a basis for the open sets consisting of open-closed sets.
Two metric spaces E and F are called metrically equivalent if there exists a
homeomorphism from E onto F which is uniformly continuous in both directions.
Two metrics on the same space E are called equivalent if the identity
mapping of E onto itself is a metric equivalence with respect to these metrics.
The topology of formal convergence (a term due to E. Witt) in the ring of all
formal power series


Received May 4, 1956. 


$$p = \sum_{k \ge 0} c_k z^k,$$

$c_k$ arbitrary elements from a given ring and z an indeterminate, is obtained
by taking the ideals $(z^k)$ as a system of neighbourhoods of p = 0. The space
of n-adic numbers, defined by completing the rational numbers with respect
to the ring topology given by the ideals $(n^k)$, taken in its natural metric, will
be referred to as the n-adic metric space. A uniform structure of a space E
is here a "système fondamental d'entourages" in the sense of (3; chap. II),
compatible with the topology of E.

Equivalence relations on a set will be denoted by $\alpha, \beta, \gamma, \ldots$. All equivalence
relations considered here will be relations on some topological space E. The
$\alpha$-class to which $x \in E$ belongs will be called $\alpha(x)$. The $\alpha$ for which $\alpha(x) = E$
is called the all-relation. The number of $\alpha$-classes into which E decomposes
will be denoted by $|\alpha|$ and called the index of $\alpha$. If each $\alpha$-class is an open-closed
set in E, $\alpha$ will be called open-closed. The expression $\alpha \le \beta$ ("$\alpha$ is finer than $\beta$")
means that each $\alpha$-class is contained in some $\beta$-class. If $\alpha \le \beta$ and each $\beta$-class
contains the same number of $\alpha$-classes, this number will be denoted by $(\beta : \alpha)$,
called the index of $\alpha$ in $\beta$. Writing down $(\beta : \alpha)$ will always be meant to imply
the existence of this number.
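On a finite set these notions can be tried out directly. The following is a minimal sketch, assuming an equivalence relation is represented by its list of classes (as frozensets); the names `is_finer` and `index_in` are hypothetical and only illustrate the definitions above.

```python
def is_finer(alpha, beta):
    """alpha <= beta: every alpha-class lies inside some beta-class."""
    return all(any(a <= b for b in beta) for a in alpha)


def index_in(beta, alpha):
    """Return (beta : alpha) if every beta-class contains the same number
    of alpha-classes, and None if this index does not exist."""
    if not is_finer(alpha, beta):
        return None
    counts = {sum(1 for a in alpha if a <= b) for b in beta}
    return counts.pop() if len(counts) == 1 else None


# Example relations on E = {0, ..., 7}: congruence modulo 2 and modulo 4.
E = range(8)
mod2 = [frozenset(x for x in E if x % 2 == r) for r in range(2)]
mod4 = [frozenset(x for x in E if x % 4 == r) for r in range(4)]
all_relation = [frozenset(E)]
```

Here congruence modulo 4 is finer than congruence modulo 2, and each class modulo 2 splits into two classes modulo 4, so the index exists and equals 2.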

As an immediate consequence of (1, Satz 10), one has 


LEMMA 1. If a compact zero-dimensional space E possesses a uniform structure
consisting of a decreasing sequence $\alpha_1 \ge \alpha_2 \ge \ldots$ of open-closed equivalence
relations for which $\alpha_1$ is the all-relation on E and each $(\alpha_i : \alpha_{i+1})$ equals 2,
then E is homeomorphic to the space of 2-adic integers.


According to (1), this homeomorphism is given by the following method.
The $\alpha_i$-classes in each $\alpha_{i-1}$-class are taken to be numbered, in a fixed manner,
by 0 and 1; and for each $x \in E$, $c_k(x)$ is defined as the number of $\alpha_{k+2}(x)$ in
$\alpha_{k+1}(x)$. Then, E can be mapped by

$$x \to p(x) = \sum_{k \ge 0} c_k(x) z^k$$

into the ring $\mathfrak{P}$ of all power series in an indeterminate z with coefficients 0
and 1 from the prime field of characteristic 2. This mapping is a homeomorphism
of E onto $\mathfrak{P}$ if $\mathfrak{P}$ is taken with its topology of formal convergence.
In this topology, however, $\mathfrak{P}$ is homeomorphic to the space of 2-adic integers.
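The coding by coefficients can be illustrated on the ordinary integers, which form a dense subset of the 2-adic integers. The following is a minimal sketch, assuming that the relation with index i is congruence modulo 2^(i-1), so that the first relation is the all-relation and every index equals 2; with this choice the coefficient c_k(x) is simply the k-th binary digit of x. The function names are illustrative.

```python
def coefficients(x, depth):
    """First `depth` coefficients c_0(x), ..., c_{depth-1}(x): the binary
    digit k tells which half of its class modulo 2**k contains x."""
    return [(x >> k) & 1 for k in range(depth)]


def same_class(x, y, i):
    """x and y are equivalent under relation number i
    (assumed here to be congruence modulo 2**(i - 1))."""
    return (x - y) % 2 ** (i - 1) == 0
```

Two integers are equivalent under relation number k + 1 exactly when their first k coefficients agree, which is the finite shadow of the statement that the mapping is a homeomorphism for the topology of formal convergence.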

Lemma 1 can be strengthened slightly: One can replace the hypothesis
$(\alpha_i : \alpha_{i+1}) = 2$ by the weaker condition

$$(\alpha_i : \alpha_{i+1}) = 2^{n_i},$$

with some natural numbers $n_i$, for in this case there are, for each i, sequences

$$\alpha_i = \beta_1 \ge \beta_2 \ge \ldots \ge \beta_{n_i + 1} = \alpha_{i+1}$$

between $\alpha_i$ and $\alpha_{i+1}$ satisfying $(\beta_j : \beta_{j+1}) = 2$.


A further result from (1) needed here is: 


LEMMA 2. A space E is zero-dimensional if and only if its open-closed
equivalence relations form a uniform structure of E. A compact E is zero-dimensional
if and only if it has a uniform structure consisting of a decreasing
sequence of open-closed equivalence relations of finite index.


Of course, the second part of this statement would no longer be true if the 
condition, always implicitly assumed here, that E have a denumerable basis 
were not satisfied. 

Finally, a metric |x, y| on a set E is called non-archimedean, if it satisfies
the condition $|x, y| \le \max\{|x, z|, |z, y|\}$ for any x, y and z from E. It is well
known that any non-archimedean metric space is zero-dimensional.


3. The compact case. In order to prove the first statement in §1 it
is now sufficient to show that any compact E of dimension zero and dense in
itself possesses a uniform structure consisting of open-closed $\alpha_i$ such that
$\alpha_i \ge \alpha_{i+1}$ and $(\alpha_i : \alpha_{i+1})$ is always a power of 2.

Let $\alpha_i$ (i = 1, 2, ...) be a decreasing sequence of relations on E as given by
Lemma 2. Since E is dense in itself, any $\alpha_i$-class must consist of more than just
one point. Therefore, any fixed $\alpha_i$-class contains an arbitrarily large number of
$\alpha_k$-classes for suitably large k. From this, it can be deduced that there is also
a decreasing sequence of open-closed equivalence relations $\beta_i$ such that $\beta_i \le \alpha_i$,
$\beta_i \ge \alpha_{n(i)}$ for some suitable n(i), and $(\beta_i : \beta_{i+1})$ is a power of 2.

Suppose that the first k members $\beta_1 \ge \beta_2 \ge \ldots \ge \beta_k$ of this new sequence
have already been determined. Then, by assumption, $\beta_k \ge \alpha_{n(k)}$. If $(\beta_k : \alpha_{n(k)+1})$
is defined and a power of 2, one can take

$$\beta_{k+1} = \alpha_{n(k)+1}, \qquad n(k+1) = n(k) + 1.$$

Otherwise, let m be the largest number of $\alpha_{n(k)+1}$-classes contained in any $\beta_k$-class
and $2^s \ge m$. In any of the finitely many $\beta_k$-classes B, let $m_B$ be the number
of $\alpha_{n(k)+1}$-classes and C a fixed one of these. Now, for a sufficiently large $l_B$, C
contains more than $2^s - m_B + 1$ $\alpha_{l_B}$-classes. By forming, if necessary, unions
of these, one can obtain a decomposition of C into exactly $2^s - m_B + 1$
open-closed sets. These, together with the $m_B - 1$ $\alpha_{n(k)+1}$-classes in B other than
C, decompose B into $2^s$ open-closed sets, and taking this for each B, one has a
decomposition of this kind for E. The corresponding relation $\beta$ is open-closed,
satisfies $\beta \le \alpha_{k+1}$ because of $\beta \le \alpha_{n(k)+1} \le \alpha_{k+1}$, and also $\beta \ge \alpha_l$ for any l
greater than all $l_B$. Hence, one can put $\beta_{k+1} = \beta$ and n(k + 1) equal to, say,
the first number greater than the $l_B$.
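The counting in this step can be sketched on finite data. The following is a minimal illustration, assuming each class B of the coarser relation is given as a list of classes of the finer relation, and each of those as a list of still finer pieces; the representation, the names `refine_block` and `refine`, and the sample data are hypothetical and not taken from the paper.

```python
def refine_block(block, s):
    """Cut one class B into exactly 2**s parts.

    `block` is a list of finer classes, each given as a list of fine pieces.
    The first finer class plays the role of C: it is cut into
    2**s - m_B + 1 parts, the other m_B - 1 classes are kept whole.
    """
    m_b = len(block)
    target = 2 ** s - m_b + 1
    c, others = block[0], block[1:]
    if len(c) < target:
        raise ValueError("C has too few fine pieces; use a finer relation")
    # The first target - 1 pieces stay alone, the rest form one union.
    parts_of_c = [[p] for p in c[:target - 1]] + [list(c[target - 1:])]
    return parts_of_c + [list(a) for a in others]


def refine(blocks):
    """Choose the least s with 2**s >= max m_B and refine every class."""
    m = max(len(b) for b in blocks)
    s = 0
    while 2 ** s < m:
        s += 1
    return s, [refine_block(b, s) for b in blocks]


# Two classes: one containing 3 finer classes, one containing a single class.
blocks = [
    [["a1", "a2", "a3"], ["b"], ["c"]],
    [["d1", "d2", "d3", "d4"]],
]
```

In the example m = 3, so s = 2, and both classes end up divided into exactly four parts, which is what makes the new index a power of 2.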

This completes the proof, since it was assumed that $\alpha_1$ is the all-relation
and $\beta_1$, therefore, can be taken as $\alpha_1$. That the sequence $\beta_i$ forms a uniform
structure of E is, of course, an immediate consequence of $\beta_i \le \alpha_i$.

As a corollary one has: Any totally bounded non-archimedean metric space
which is dense in itself is metrically equivalent to a subspace of the metric space of
2-adic integers. For a space E of this type has a zero-dimensional compact
$\bar{E}$ as its metric completion which is also dense in itself, therefore homeomorphic
to the space of 2-adic integers and hence metrically equivalent to it with
respect to its metric induced from E and the natural metric for the 2-adic
integers in the latter. Also, since the space of $\mathfrak{p}$-adic integers for any prime
ideal $\mathfrak{p}$ of any number field is compact, dense in itself and has a denumerable
basis, one obtains as a further consequence: For any $\mathfrak{p}$, the space of $\mathfrak{p}$-adic
integers is homeomorphic to the space of 2-adic integers.

The first of these corollaries can be regarded as a partial strengthening of a 
theorem by Urysohn (6; vol. I, §23) according to which any zero-dimensional 
space is homeomorphic to a subspace of Cantor's compact zero-dimensional 


space which is, of course, homeomorphic to the space of 2-adic integers 


4. The non-compact locally compact case. If E is zero-dimensional,
and not compact but locally compact, then it is the union of denumerably
many disjoint open-closed compact sets: Since E has a denumerable basis
for its open sets, it also has such a basis $\mathfrak{B}$ consisting of open-closed sets.
If, then, for each $x \in E$, $V_x$ is an open neighbourhood with compact closure
and $B_x \in \mathfrak{B}$ such that $x \in B_x \subseteq V_x$, these $B_x$ are compact open-closed and
have E as their union. Furthermore, there are only denumerably many of
them and, hence, they can be arranged in a sequence $B_i$, i = 1, 2, .... Now
the disjoint sets

$$B^*_i = B_i - \bigcup_{k < i} B_k$$

still have E as their union and are open-closed compact.
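The passage from a covering sequence to a disjoint one is the usual "subtract everything earlier" construction. A minimal sketch on finite sets, with an illustrative function name, could read:

```python
def disjointify(sets):
    """Replace each set by its part not covered by the earlier sets."""
    seen = set()
    result = []
    for b in sets:
        result.append(set(b) - seen)
        seen |= set(b)
    return result
```

The new sets are pairwise disjoint and have the same union as the original ones; some of them may be empty and can then be dropped from the sequence.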

The considered space E being dense in itself, each of these $B^*_i$, since it is
open in E, must also be dense in itself and therefore homeomorphic to the
space of 2-adic integers. Furthermore, as the $B^*_i$ are open-closed, E is the
topological sum in the sense of (3; chap. I) of its compact subspaces $B^*_i$,
hence homeomorphic to the sum of denumerably many spaces of 2-adic
integers. Finally, as the space of the 2-adic numbers is itself a space of this
type, E is homeomorphic to it.

In exactly the same manner as above one obtains the corollary: For any $\mathfrak{p}$,
the space of $\mathfrak{p}$-adic numbers is homeomorphic to the space of 2-adic numbers.


5. The number of distinct non-archimedean metrics of a zero-dimensional
space. Let E be the space in question. It possesses a sequence
$\alpha_1 \ge \alpha_2 \ge \ldots$ of open-closed equivalence relations where $\bigcap \alpha_i(x) = \{x\}$ and
the $\alpha_i(x)$ form a neighbourhood basis for each $x \in E$, originating from one of
its non-archimedean metrics which are known to exist (1). The connection
between the $\alpha_i$ and the metric is such that

$$|x, y| = 2^{-k} \quad \text{if } x\,\alpha_k\,y \text{ and not } x\,\alpha_{k+1}\,y \tag{$*$}$$

defines an equivalent metric (1). If, now, each $|\alpha_i|$ is finite, E is totally bounded
with respect to this metric. Therefore, under the hypothesis that E is not
totally bounded in each of its non-archimedean metrics (the other case will
be considered later on) one can assume that, for some i, E decomposes into
infinitely many $\alpha_i$-classes. Obviously, no generality is lost by taking i = 1.
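The way a decreasing sequence of equivalence relations produces a non-archimedean metric can be tried on the integers. The following is a minimal sketch, assuming that relation number k is congruence modulo 2^(k-1) (so the first relation is the all-relation); the names `related` and `distance` and the cut-off `max_k` are illustrative assumptions.

```python
from fractions import Fraction


def related(x, y, k):
    """x and y are equivalent under relation number k
    (assumed here to be congruence modulo 2**(k - 1))."""
    return (x - y) % 2 ** (k - 1) == 0


def distance(x, y, max_k=64):
    """2**(-k), where k is the last index for which x and y are still
    equivalent; 0 if x == y."""
    if x == y:
        return Fraction(0)
    k = 1
    while k < max_k and related(x, y, k + 1):
        k += 1
    return Fraction(1, 2 ** k)
```

With this choice the distance between 0 and 4 is 1/8, and the strong triangle inequality holds because the relations are nested: if x and z, and z and y, are both equivalent under relation number k, then so are x and y.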













Since E is separable, the open-closed $\alpha_1$-classes are a denumerable collection
of sets, say, $C_1, C_2, \ldots$. Then, with respect to a given increasing sequence
$k_1, k_2, \ldots$ of natural numbers, one can decompose $C_m$ into its $\alpha_{k_m}$-classes.
The decomposition of E into open-closed sets thus obtained gives an open-closed
equivalence relation $\beta_0$ which can be used to define the sequence
$\beta_i = \beta_0 \wedge \alpha_i$, where $\wedge$ denotes taking the lattice theoretic meet of two equivalence
relations (2). In the manner given by the formula (*), the sequence
$\beta_1 \ge \beta_2 \ge \ldots$ defines a new non-archimedean metric on E. The number of
metrics that can be obtained in this way is equal to the number of increasing
sequences of natural integers, hence equal to the cardinal number c of the
continuum. However, these c different metrics need not all be inequivalent
to each other. In order to prove the assertion stated in §1 it will now be shown
that this set of metrics splits into c different equivalence classes.
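For relations given by their classes, the lattice theoretic meet used above has a very concrete description: its classes are the non-empty intersections of a class of one relation with a class of the other. A minimal sketch on a finite set, with hypothetical names and sample data:

```python
def meet(alpha, beta):
    """Classes of the meet: all non-empty intersections of an alpha-class
    with a beta-class."""
    return [a & b for a in alpha for b in beta if a & b]


# Example on E = {0, ..., 7}: parity classes met with the two halves of E.
E = range(8)
mod2 = [frozenset(x for x in E if x % 2 == r) for r in range(2)]
halves = [frozenset(range(0, 4)), frozenset(range(4, 8))]
```

The meet of the two example relations has the four classes {0, 2}, {4, 6}, {1, 3} and {5, 7}; it is finer than both relations, and it is open-closed whenever both relations are, since intersections of two open-closed sets are open-closed.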

Let $k^*_i$ and $k_i$ be two different increasing sequences of natural integers and
$\beta^*_i$, $\beta_i$ the two corresponding sequences of open-closed equivalence relations.
The metrics defined by $\beta^*_i$ and $\beta_i$ will be equivalent if and only if to each $\beta_i$
there exists a $\beta^*_j \le \beta_i$ and vice versa. In particular, one then has a relation
of the type $\beta_i \ge \beta^*_j \ge \beta_k$ with suitable i, j and k. Now, by definition of $\beta_i$,
the $\beta_i$-classes contained in $C_m$ will be equal to the $\beta_0$-classes in $C_m$ for all
sufficiently large m: as $\beta_i = \beta_0 \wedge \alpha_i$, the $\beta_i$-classes on $C_m$ are intersections of
$\alpha_{k_m}$-classes and $\alpha_i$-classes. If m is large enough, one has $\alpha_{k_m} \le \alpha_i$ anyway,
so these intersections will merely be $\alpha_{k_m}$-classes, and these are also the $\beta_0$-classes
in $C_m$. The relation $\beta_i \ge \beta^*_j \ge \beta_k$ therefore implies that the $\beta^*_j$-classes
in all $C_m$ for sufficiently large m are also equal to the $\beta_0$-classes, and this then
gives the result: From a certain $m = m_0$ onwards $\beta_0$ and $\beta^*_0$ are equal in $C_m$.

Now, let the sequence $\alpha_1 \ge \alpha_2 \ge \ldots$ satisfy this further condition: Any
$\alpha_i$-class decomposes into more than one $\alpha_{i+1}$-class. Then, $\beta_0$ and $\beta^*_0$ can only
be equal on $C_m$ if $k_m = k^*_m$. In this case, therefore, one obtains that the two
sequences $k_i$ and $k^*_i$ are equal from a suitable index onwards. Since to any
increasing sequence of natural integers there exist only denumerably many
other such sequences coinciding with it from a suitable index onwards, the c
different metrics defined above group into equivalence classes of at most
denumerably many metrics each; the number of these classes will then still
be c.

The restriction just placed on the sequence $\alpha_1 \ge \alpha_2 \ge \ldots$ can be shown to
be satisfied if not by the $\alpha_i$ themselves, then at least by suitable modifications
of them. The basis for this will be that each open-closed set in E, being infinite
since E is dense in itself, can be decomposed into two open-closed sets. Using
this, one may define new relations $\alpha^*_i$ in the following way: $\alpha^*_1 = \alpha_1$. If
$\alpha^*_i$ is already defined such that $\alpha^*_i \le \alpha_i$, decompose each $\alpha^*_i \wedge \alpha_{i+1}$-class
into two open-closed sets and define $\alpha^*_{i+1}$ by the resulting decomposition of E.
The sequence $\alpha^*_1 \ge \alpha^*_2 \ge \ldots$ has all desired properties and using it, if
necessary, in place of the original $\alpha_i$, one obtains the existence of c inequivalent
non-archimedean metrics on E.









The remaining case to be considered is that E is totally bounded in all its
non-archimedean metrics. This property implies, as will be shown now, the
compactness of E. Let $\alpha_1 \ge \alpha_2 \ge \ldots$ be chosen as above and take any open-closed
equivalence relation $\alpha$ on E. As the sequence $\alpha_i$ defines a non-archimedean
metric on E by (*), so does the sequence $\alpha'_i = \alpha \wedge \alpha_i$. E being totally
bounded in this metric, there are only finitely many $\alpha$-classes. Hence, any
decomposition of E into open-closed sets must be finite. From this it follows
that any denumerable open-closed covering of E contains a finite covering,
for if

$$E = \bigcup_i B_i,$$

$B_i$ open-closed, then the

$$B^*_i = B_i - \bigcup_{k < i} B_k$$

give a decomposition of E into open-closed sets which will, of course, only
be finite if $B^*_i = \emptyset$ and therefore

$$\bigcup_{k < i} B_k \supseteq B_i$$

from some $i = i_0$ onwards; this gives

$$\bigcup_{i < i_0} B_i = E.$$


Finally, as E is zero-dimensional, one concludes from this that any open
covering of E, having a denumerable open-closed covering as a refinement,
contains a finite covering. E therefore is compact.

In all, it is then proved that a zero-dimensional space which is dense in
itself either has at least c inequivalent non-archimedean metrics or is compact,
in which case, of course, all its metrics are equivalent.


6. Imbeddings by metric equivalences. Let $\mathfrak{P}$ now be the ring of all
formal power series

$$p = \sum_{k \ge 0} c_k z^k$$

with integral coefficients, taken with its topology of formal convergence.
This space $\mathfrak{P}$ is a universal space for the separable non-archimedean metric
spaces in the sense that any such space can be mapped into $\mathfrak{P}$ by a metric
equivalence. The mapping which will do this can again be defined as follows
(1):

Let $\alpha_0 \ge \alpha_1 \ge \ldots$ be a sequence of open-closed equivalence relations on E
resulting from its given metric. Then, the $\alpha_i$-classes in the different $\alpha_{i-1}$-classes
can be regarded as numbered in a fixed manner. With respect to this
numbering, let $c_i(x)$ be the number of $\alpha_i(x)$ in $\alpha_{i-1}(x)$ for $x \in E$; then, in
exactly the same way as in §1 in the special case of the compact spaces
(Lemma 1) the mapping

$$p: x \to p(x) = \sum c_i(x) z^i$$

is a metric equivalence of E into $\mathfrak{P}$. Of course, in special cases, p may even
be a metric isomorphism, but since the sequence $\alpha_1 \ge \alpha_2 \ge \ldots$ determines
the metric only up to equivalence, this will not be so in general.

$\mathfrak{P}$ is obviously itself a separable non-archimedean metric space, since the
denumerable set of all integral polynomials is dense in $\mathfrak{P}$. Therefore $\mathfrak{P}$ is, in
a sense, a minimal universal space for this type of space and, of course,
characterized by this property.

For a more restricted class of metric spaces than the one just considered, one
can obtain a universal space of an even simpler nature than the space $\mathfrak{P}$.
A non-archimedean metric on a separable space E will be called evenly locally
compact if it can be represented, up to equivalence, by a decreasing sequence
$\alpha_1 \ge \alpha_2 \ge \ldots$ of open-closed equivalence relations such that $\alpha_1(x)$ is compact
for each $x \in E$ and the (necessarily finite) indices $(\alpha_{i-1} : \alpha_i)$ exist (see §2).
Then, the following holds: Each separable space with an evenly locally compact
non-archimedean metric is metrically equivalent to a subspace of the metric space
of 2-adic numbers. The class of spaces admitted here includes, of course, all
the n-adic metric spaces for any natural integer n and, more generally, the
spaces of all separable locally compact groups whose topologies are given by a
denumerable decreasing sequence of invariant subgroups as neighbourhood
basis for the unit element.

By the preceding construction, a space E of the type now considered is 
metrically equivalent to a subset of $8 contained in the set given by all 

Zz GyZ , a,c a. < i. 
n>0 
where jo denotes the (possibly infinite) number of a,-classes in £ and j,, 


n > 1, the index (a,: a@,41). Now, for any natural integer a one has 
a = go(a) + ¢i(a) 2 4 . + ¢.(a) 2", 


¢i(a) equal to 0 or 1, with some suitable s. In particular, then, any a, 
0 <a < },, can be written as 


go(a) + gi(a) 2+... + ¢,,(a2) 2 forn > 1. 


Using this, we can define a mapping ¢ of 


w= a,2 , : 


Ir 
~ 
r~) 
A 
Se. 
3 
= 
Il 
= 
te 


by 


so sot+l 1 
g(w) = ¢s, (do) Zz + ¥ so 1(@o) 2 Tr... ¢1\d¢) 2 T olde) 


Sn 


+ >> (¢:(a,) 2 + ¢g2(a,) 2 + -+¢ Sie + Sn 
n=1 


¢(w) is an element of the ring ¥ of all Laurent forms 










in z with integral coefficients, which, again, will be taken with the topology 


of formal convergence. Now, w; = w2(mod z*t') implies a, = b,, n = 0. 
BO ae e, for the coefficients a, of w;, and b, of we and therefore ¢,(a,) = ¢ 
(b,) for n = 0,1,..., e and all corresponding 7. From this it follows thay 
81 +8e+ +a,+1 
¢(w,;) = ¢(we) (mod 2" ). 


Similarly, the converse holds, and since the 


lial 

define a metric equivalent to that defined by the (z*), ¢ is a metric equivalence. 
Furthermore, all g(w) lie in a part of ¥ which is itself metrically equivalent 
to the 2-adic metric space. This proves the above assertion. 
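
The digit bookkeeping behind such a re-encoding can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the construction of the paper: each coefficient $a_n$, $n \ge 1$, is assumed to be written with a fixed number $s_n$ of binary digits, and the names `binary_digits`, `encode` and `common_prefix` are hypothetical.

```python
def binary_digits(a, width):
    """Return the low `width` binary digits of a, least significant first."""
    if not 0 <= a < 2 ** width:
        raise ValueError("a does not fit into the given number of digits")
    return [(a >> i) & 1 for i in range(width)]


def encode(coeffs, widths):
    """Concatenate the binary digits of a_1, a_2, ... into one 0/1 sequence.

    Assumption: coefficient a_n always occupies exactly widths[n] places.
    """
    digits = []
    for a, s in zip(coeffs, widths):
        digits.extend(binary_digits(a, s))
    return digits


def common_prefix(u, v):
    """Length of the longest common initial segment of two sequences."""
    n = 0
    for x, y in zip(u, v):
        if x != y:
            break
        n += 1
    return n


widths = [2, 1, 3]                 # assumed digit counts s_1, s_2, s_3
w1 = encode([2, 1, 5], widths)
w2 = encode([2, 1, 6], widths)     # agrees with w1 in the first two coefficients
agreement = common_prefix(w1, w2)  # at least s_1 + s_2 places agree
```

Two coefficient sequences that agree in their first $e$ terms give digit sequences that agree in at least $s_1 + \dots + s_e$ places, which is the kind of comparison between the two systems of congruences used in the argument.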

In the case of non-compact separable evenly locally compact non-archimedean
metric spaces which are dense in themselves one can easily obtain a
much stronger result: Any such space is metrically equivalent to the metric space
of 2-adic numbers. A space $E$ of this type is, of course, the sum of its compact
open-closed $\alpha_1$-classes (see above) $K_i$, $i = 1, 2, \dots$, and each of these is
homeomorphic to the space of 2-adic integers. If $C_i$, $i = 1, 2, \dots$, is the
complete system of residue classes, with respect to addition, of all 2-adic
numbers modulo the 2-adic integers, then $K_i$ can be mapped homeomorphically
onto $C_i$ for each $i$. This gives a mapping $\psi$ defined on $E$ which is a metric
equivalence. For, up to equivalence between metrics on $E$, the $K_i$ can be
taken to be the "unit spheres" in $E$, and this is what the $C_i$ are in the 2-adic
metric space. Then, since $\psi$ carries unit spheres into unit spheres and is, in its
restriction to these, a metric equivalence, it also is this for $E$ as a whole.

In particular, this proves the last statement in §1 since the n-adic metric 
spaces are of the type just considered. More so, one can say that there are 
only two essentially different, i.e., inequivalent, separable and evenly locally 
compact non-archimedean metric spaces which are dense in themselves: The 
metric spaces of 2-adic integers and 2-adic numbers. 

As a concluding remark, it may be pointed out that the metric product
spaces of any finite number of $p_i$-adic metric spaces studied in (5) are all
metrically equivalent to each other. This follows from the last remark and the
fact that the metric product of $p_i$-adic metric spaces $(i = 1, 2, \dots, r)$ is
metrically equivalent to the $p_1 p_2 \cdots p_r$-adic metric space, which is an immediate
consequence of (4; p. 96, Satz 5). There is a remarkable contrast between
this and the results in (5) according to which an isometric mapping of one
product of this type into another one can only exist if the product of the
corresponding sets of prime numbers for the first space is less than or equal
to that for the second space. This, then, goes to show that metric equivalence
leads to much wider classes than the more restrictive concept of isometry.





REFERENCES

1. B. Banaschewski, Ueber nulldimensionale Räume, Math. Nachr. 13 (1955), 129-140.
2. G. Birkhoff, Lattice theory, Amer. Math. Soc. Coll. Publ. XXV (1948).
3. N. Bourbaki, Topologie générale, chaps. I and II, Act. sci. et ind. (Paris, 1948).
4. M. Deuring, Algebren (Berlin, 1935).
5. G. E. N. Fox, The topology of p-adic number fields, doctoral thesis, McGill University, 1955.
6. C. Kuratowski, Topologie I and II, Monografie Matematyczne, XX and XXI (Warsaw, 1948 and 1950).

Hamilton College,
McMaster University,
Hamilton, Ontario





























MATRICES WITH ELEMENTS IN A BOOLEAN RING 
A. T. BUTSON


1. Introduction. Let $\mathfrak{B}$ be a Boolean ring of at least two elements
containing a unit 1. Form the set $\mathfrak{M}$ of matrices $A, B, \dots$ of order $n$ having
entries $a_{ij}, b_{ij}, \dots$ $(i, j = 1, 2, \dots, n)$, which are members of $\mathfrak{B}$. A matrix
$U$ of $\mathfrak{M}$ is called unimodular if there exists a matrix $V$ of $\mathfrak{M}$ such that $VU = I$,
the identity matrix. Two matrices $A$ and $B$ are said to be left-associates if
there exists a unimodular matrix $U$ satisfying $UA = B$. The main results
in this paper are the constructions of two canonical forms for left-associated
matrices of $\mathfrak{M}$. The first form may be described very simply; however, it
lacks the desirable property of containing the maximum possible number of
rows which consist entirely of 0's. Although the second has this property,
its description is quite complicated. They are somewhat similar to the
well-known Hermite form for matrices with elements in a principal ideal ring
(4); and, accordingly, use is made of them to establish analogues of several
other familiar results concerning matrices with elements in a principal ideal
ring. Although row equivalence (left-associativity) and a diagonal canonical
form for equivalent matrices of $\mathfrak{M}$ are mentioned in (2, pp. 164-165), the
author has been unable to locate his results anywhere in the literature.


2. Properties of $\mathfrak{B}$. A Boolean ring may be defined as a ring whose elements
are all idempotent. It is easily shown, see (2, pp. 154-155), that it is a
commutative ring of characteristic two, in the usual sense. Then for any $x$
in $\mathfrak{B}$, the element $x' = 1 + x$, called the complement of $x$, satisfies $x + x' = 1$,
$xx' = 0$, and $(x')' = x$. Bell (1) observed that $x \vee y = x + x'y$ is the g.c.d.
of $x$ and $y$. Following is a summary of the less obvious but easily established
properties of $\mathfrak{B}$ which we shall use in the sequel.

(2.1) $xx = x$;
(2.2) $xy = yx$;
(2.3) $x + x = 0$;
(2.4) $x + x' = 1$, $xx' = 0$, $(x')' = x$;
(2.5) $\bigvee_{i=1}^{n} x_i = x_1 \vee x_2 \vee \dots \vee x_n = x_1 + x_1' x_2 + x_1' x_2' x_3 + \dots + x_1' x_2' \cdots x_{n-1}' x_n$
      is the g.c.d. of $x_1, x_2, \dots, x_n$;
(2.6) $\sum_{i=1}^{t} x_i \bigvee_{j=1}^{n} x_j = \sum_{i=1}^{t} x_i$, $t = 1, 2, \dots, n$;
(2.7) $\bigl(\bigvee_{i=1}^{n} x_i\bigr)' = x_1' x_2' \cdots x_n'$, $(x_1 x_2 \cdots x_n)' = \bigvee_{i=1}^{n} x_i'$;
(2.8) $\bigvee_{i=1}^{n} x_i = 0$ if and only if $x_1 = x_2 = \dots = x_n = 0$;
(2.9) $xy = 0$ if and only if $xy' = x$.

Received March 4, 1956.
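
The identities above can be checked mechanically in a small concrete model. The following Python sketch is an illustration only: it assumes the Boolean ring of all subsets of a three-element set, coded as the integers 0 to 7 with XOR as addition and AND as multiplication, and it reads (2.6) as the absorption of a partial sum by the join. The names `add`, `mul`, `comp`, `join_by_formula` and `identities_hold` are hypothetical.

```python
from itertools import product

ATOMS = 3                      # assumed size of the underlying set
ONE = (1 << ATOMS) - 1         # the unit 1 of the ring
ELEMENTS = range(ONE + 1)


def add(x, y):
    """Ring addition: symmetric difference of subsets."""
    return x ^ y


def mul(x, y):
    """Ring multiplication: intersection of subsets."""
    return x & y


def comp(x):
    """Complement x' = 1 + x."""
    return ONE ^ x


def join_by_formula(xs):
    """Evaluate x_1 + x_1'x_2 + x_1'x_2'x_3 + ... as in (2.5)."""
    total, acc = 0, ONE
    for x in xs:
        total = add(total, mul(acc, x))
        acc = mul(acc, comp(x))
    return total


def identities_hold():
    """Brute-force check of (2.1)-(2.9) for all triples of elements."""
    for x, y, z in product(ELEMENTS, repeat=3):
        j = join_by_formula([x, y, z])
        checks = [
            mul(x, x) == x,                                          # (2.1)
            mul(x, y) == mul(y, x),                                  # (2.2)
            add(x, x) == 0,                                          # (2.3)
            add(x, comp(x)) == ONE and mul(x, comp(x)) == 0
            and comp(comp(x)) == x,                                  # (2.4)
            j == x | y | z,                                          # (2.5)
            mul(add(x, y), j) == add(x, y),                          # (2.6), t = 2
            comp(j) == mul(mul(comp(x), comp(y)), comp(z)),          # (2.7)
            (j == 0) == (x == 0 and y == 0 and z == 0),              # (2.8)
            (mul(x, y) == 0) == (mul(x, comp(y)) == x),              # (2.9)
        ]
        if not all(checks):
            return False
    return True
```

In this model the join of (2.5) is simply the union of subsets, which is why it can be compared with the bitwise OR.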


3. Canonical forms. In constructing the canonical forms only one type
of elementary operation is needed, the addition to the elements of a row of $x$
times the corresponding elements of another row, $x$ being in $\mathfrak{B}$. Furthermore,
this elementary operation can be accomplished by multiplying the given
matrix on the left by an elementary matrix, namely the matrix obtained by
performing the desired elementary operation upon the identity matrix $I$.
If $E$ is any elementary matrix, it follows from (2.3) that $EE = I$. Quite
obviously then, any elementary matrix is unimodular, and a product of
unimodular matrices is unimodular. To facilitate describing the constructions,
we first establish a lemma.
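
A minimal Python sketch of an elementary matrix and of the relation $EE = I$ follows. It is an illustration under assumptions: ring elements are modelled as bitmasks (XOR as addition, AND as multiplication, `one` as the unit), and the helper names `identity`, `elementary` and `mat_mul` are hypothetical.

```python
def identity(n, one):
    """The n x n identity matrix over the bitmask model of the ring."""
    return [[one if i == j else 0 for j in range(n)] for i in range(n)]


def elementary(n, src, dst, x, one):
    """Identity matrix with x placed at (dst, src).

    Left multiplication by this matrix adds x times row `src` to row `dst`.
    """
    if src == dst:
        raise ValueError("source and destination rows must differ")
    e = identity(n, one)
    e[dst][src] = x
    return e


def mat_mul(a, b):
    """Matrix product with XOR as addition and AND as multiplication."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc ^= a[i][k] & b[k][j]
            out[i][j] = acc
    return out


ONE = 7                                   # three atoms, chosen for illustration
E = elementary(3, 0, 2, 5, ONE)           # add 5 times row 0 to row 2
A = [[1, 2, 3], [4, 5, 6], [7, 0, 1]]
EA = mat_mul(E, A)
EE = mat_mul(E, E)                        # equals the identity, by (2.3)
```

Applying the same elementary operation twice adds $x + x = 0$ times the source row, which is why every elementary matrix is its own inverse.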








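LEMMA 3.1. For $0 < j \le n$, let $A(j) = [B(j) \; H(n - j)]$ be the following
matrix of $\mathfrak{M}$:

$$\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1j} & 0 & 0 & \cdots & 0 \\
b_{21} & b_{22} & \cdots & b_{2j} & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
b_{j1} & b_{j2} & \cdots & b_{jj} & 0 & 0 & \cdots & 0 \\
b_{j+1,1} & b_{j+1,2} & \cdots & b_{j+1,j} & h_{j+1,j+1} & 0 & \cdots & 0 \\
b_{j+2,1} & b_{j+2,2} & \cdots & b_{j+2,j} & h_{j+2,j+1} & h_{j+2,j+2} & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{nj} & h_{n,j+1} & h_{n,j+2} & \cdots & h_{nn}
\end{bmatrix}$$

where $A = A(n) = [B(n) \; H(0)]$, $H = A(0) = [B(0) \; H(n)]$, and $h_{pq} h_{qq} = 0$,
$h_{pq} h_{pp} = h_{pq}$ for $p = q + 1, q + 2, \dots, n$; $q = j + 1, j + 2, \dots, n$. Then
there exists a unimodular matrix $U_j$ (which is a product of elementary matrices)
such that multiplying $A(j)$ on the left by $U_j$ leaves the last $n - j$ columns of
$A(j)$ invariant, and replaces the elements $b_{kj}$ of the $j$th column of $A(j)$ by elements
$h_{kj}$, where $h_{kj} = 0$ for $k = 1, 2, \dots, j - 1$; and $h_{kj} h_{jj} = 0$, $h_{kj} h_{kk} = h_{kj}$ for
$k = j + 1, j + 2, \dots, n$. (In terms of matrices we have

$$A(j - 1) = U_j A(j) = U_j [B(j) \; H(n - j)] = [B(j - 1) \; H(n - j + 1)],$$

where it is to be understood that although $H(n - j)$ is a submatrix of $H(n - j + 1)$,
$B(j - 1)$ is not necessarily a submatrix of $B(j)$.)

Let $E_{kj}$ denote the elementary matrix obtained from $I$ by adding $x_{kj}$ times
the elements of the $k$th row to the corresponding elements of the $j$th row,
where

$$x_{1j} = b_{jj}';$$
$$x_{kj} = b_{jj}' b_{1j}' b_{2j}' \cdots b_{k-1,j}', \qquad k = 2, 3, \dots, j - 1;$$
$$x_{kj} = h_{kk}' b_{1j}' b_{2j}' \cdots b_{jj}' (b_{j+1,j} h_{j+1,j+1}')' \cdots (b_{k-1,j} h_{k-1,k-1}')', \qquad k = j + 1, j + 2, \dots, n.$$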










It is quite obvious that adding $x_{kj}$ times the elements of the $k$th row to the
corresponding elements of the $j$th row, for $k = 1, 2, \dots, j - 1$, does not
affect the last $n - j$ columns of $A(j)$. For $q = j + 1, j + 2, \dots, n$;
$k = j + 1, j + 2, \dots, n$; the multiplier $x_{kj}$ contains the factor $h_{kk}'$, so that

$$x_{kj} h_{kq} = x_{kj} h_{kq} h_{kk} = 0,$$

since $h_{kq} = h_{kq} h_{kk}$ and $h_{kk}' h_{kk} = 0$.
Hence adding $x_{kj}$ times the elements of the $k$th row to the corresponding
elements of the $j$th row, for $k = j + 1, j + 2, \dots, n$, does not affect the last
$n - j$ columns of $A(j)$ either. Then multiplying $A(j)$ on the left by the
unimodular matrix

$$E_j = E_{nj} E_{n-1,j} \cdots E_{j+1,j} E_{j-1,j} \cdots E_{1j}$$

leaves the last $n - j$ columns unaltered, and replaces $b_{jj}$ by

$$h_{jj} = b_{1j} \vee \dots \vee b_{jj} \vee (b_{j+1,j} h_{j+1,j+1}') \vee \dots \vee (b_{nj} h_{nn}').$$

Let $F_{kj}$, for $k = 1, 2, \dots, j - 1, j + 1, \dots, n$, denote the elementary matrix
obtained from $I$ by adding $b_{kj}$ times the elements of the $j$th row to the
corresponding elements of the $k$th row. Multiplication of $E_j A(j)$ on the left by
$F_{kj}$ obviously leaves the last $n - j$ columns invariant, and replaces $b_{kj}$ by
$h_{kj} = b_{kj} + b_{kj} h_{jj}$. By (2.6) and (2.3) $h_{kj} = b_{kj} + b_{kj} = 0$ for $k = 1, 2,
\dots, j - 1$. For $k = j + 1, j + 2, \dots, n$;

$$h_{kj} h_{jj} = (b_{kj} + b_{kj} h_{jj}) h_{jj} = b_{kj} h_{jj} + b_{kj} h_{jj} = 0.$$

Using (2.7) we can write

$$h_{kj} = (b_{kj} + b_{kj} h_{jj}) = b_{kj} h_{jj}'$$
$$= b_{kj} (b_{kj} h_{kk}')' \, b_{1j}' b_{2j}' \cdots b_{jj}' (b_{j+1,j} h_{j+1,j+1}')' \cdots (b_{k-1,j} h_{k-1,k-1}')' (b_{k+1,j} h_{k+1,k+1}')' \cdots (b_{nj} h_{nn}')';$$

and since

$$b_{kj} (b_{kj} h_{kk}')' = b_{kj} (1 + b_{kj} h_{kk}') = b_{kj} + b_{kj} h_{kk}' = b_{kj} (1 + h_{kk}') = b_{kj} h_{kk},$$

$$h_{kj} = b_{kj} h_{kk} \, b_{1j}' b_{2j}' \cdots b_{jj}' (b_{j+1,j} h_{j+1,j+1}')' \cdots (b_{k-1,j} h_{k-1,k-1}')' (b_{k+1,j} h_{k+1,k+1}')' \cdots (b_{nj} h_{nn}')',$$

and it is now obvious that $h_{kj} h_{kk} = h_{kj}$. Letting

$$F_j = F_{nj} F_{n-1,j} \cdots F_{j+1,j} F_{j-1,j} \cdots F_{1j},$$

it is apparent that $F_j E_j$ is the desired unimodular matrix $U_j$.
We remark that if $h_{jj} = 0$, then by (2.8)

$$b_{1j} = b_{2j} = \dots = b_{jj} = 0, \qquad (b_{kj} h_{kk}') = 0, \quad k = j + 1, j + 2, \dots, n,$$

whence by (2.9) $b_{kj} h_{kk} = b_{kj}$. So in this particular case we may choose $U_j = I$.













We also note that if $h_{pp} = 0$, then the requirement that $h_{pq} h_{pp} = h_{pq}$ implies
that $h_{pq} = 0$ for $q = j + 1, j + 2, \dots, p$.

THEOREM 3.1. For any matrix $A$ of $\mathfrak{M}$ there exists a unimodular matrix $U$
of $\mathfrak{M}$ which is a product of elementary matrices and such that $UA = H$ has the
following properties: $h_{pq} = 0$ for $q > p$, $h_{pq} h_{qq} = 0$ for $p > q$, and $h_{pq} h_{pp} = h_{pq}$.
(Note that if a diagonal element is 0, then the entire row consists of 0's.) This form
$H$ is unique.

Successive applications of Lemma 3.1 to $A = A(n)$ for $j = n, n - 1, \dots, 1$
yield $A(0) = [B(0) \; H(n)] = H$ and $U_1 U_2 \cdots U_n = U$ as the desired matrices.

To prove the uniqueness of $H$, let $U$ and $V$ be unimodular matrices such that
$UA = H$ and $VA = G$ each have the form described above. (The result
that a unimodular matrix has an inverse is implied by the succeeding corollary,
which is established without assuming the uniqueness of $H$. To simplify
matters, we use this now.) Then $U^{-1}H = A = V^{-1}G$, and $PH = G$, $QG = H$,
where $P = VU^{-1}$ and $Q = UV^{-1}$. Thus, for fixed $i$, the following systems of
equations must be satisfied:


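$$\sum_{k=t}^{n} p_{ik} h_{kt} = g_{it}, \qquad \sum_{k=t}^{n} q_{ik} g_{kt} = h_{it}, \qquad t = 1, 2, \dots, n,$$

where $g_{it} = h_{it} = 0$ for $t > i$. Consider the first system. The last equation
$p_{in} h_{nn} = 0$ and the condition $h_{nt} h_{nn} = h_{nt}$ imply $p_{in} h_{nt} = 0$ for $t = 1, 2, \dots, n$.
Thus the first system is equivalent to

$$\sum_{k=t}^{n-1} p_{ik} h_{kt} = g_{it}, \qquad t = 1, 2, \dots, n - 1.$$

The last equation $p_{i,n-1} h_{n-1,n-1} = 0$ of this system and the condition
$h_{n-1,t} h_{n-1,n-1} = h_{n-1,t}$ imply $p_{i,n-1} h_{n-1,t} = 0$ for $t = 1, 2, \dots, n - 1$. Thus this
system may be reduced to

$$\sum_{k=t}^{n-2} p_{ik} h_{kt} = g_{it}, \qquad t = 1, 2, \dots, n - 2.$$

Continuing this reduction for $t = n - 2, n - 3, \dots, i + 1$ yields $p_{ik} h_{kt} = 0$
for $k > i$, $t = 1, 2, \dots, n$, and replaces the first system by the equivalent
system

$$\sum_{k=t}^{i} p_{ik} h_{kt} = g_{it}, \qquad t = 1, 2, \dots, i.$$

Similarly, the second system is equivalent to

$$\sum_{k=t}^{i} q_{ik} g_{kt} = h_{it}, \qquad t = 1, 2, \dots, i.$$

Now $p_{ii} h_{ii} = g_{ii}$ and $q_{ii} g_{ii} = h_{ii}$ imply

$$g_{ii} = p_{ii} h_{ii} = p_{ii} q_{ii} g_{ii} = p_{ii} q_{ii} p_{ii} h_{ii} = q_{ii} p_{ii} h_{ii} = q_{ii} g_{ii} = h_{ii}.$$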




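Thus $p_{ii} h_{ii} = h_{ii}$, and $p_{ii} h_{it} = h_{it}$ for $t = 1, 2, \dots, i$, since $h_{it} h_{ii} = h_{it}$.
Similarly, $q_{ii} g_{ii} = g_{ii}$ and $q_{ii} g_{it} = g_{it}$. Now consider

$$p_{i,i-1} h_{i-1,i-1} + p_{ii} h_{i,i-1} = g_{i,i-1},$$
$$q_{i,i-1} g_{i-1,i-1} + q_{ii} g_{i,i-1} = h_{i,i-1}.$$

Multiplying by $h_{i,i-1}$ and $g_{i,i-1}$, respectively, gives

$$h_{i,i-1} = h_{i,i-1} g_{i,i-1}, \qquad g_{i,i-1} = g_{i,i-1} h_{i,i-1},$$

since $h_{i,i-1} h_{i-1,i-1} = g_{i,i-1} g_{i-1,i-1} = 0$, and $p_{ii} h_{i,i-1} = h_{i,i-1}$, $q_{ii} g_{i,i-1} = g_{i,i-1}$.
Hence $g_{i,i-1} = h_{i,i-1}$, and also $p_{i,i-1} h_{i-1,i-1} = q_{i,i-1} g_{i-1,i-1} = 0$ which implies

$$p_{i,i-1} h_{i-1,t} = q_{i,i-1} g_{i-1,t} = 0, \qquad t = 1, 2, \dots, i - 1.$$

Next we consider

$$p_{i,i-2} h_{i-2,i-2} + p_{i,i-1} h_{i-1,i-2} + p_{ii} h_{i,i-2} = g_{i,i-2},$$
$$q_{i,i-2} g_{i-2,i-2} + q_{i,i-1} g_{i-1,i-2} + q_{ii} g_{i,i-2} = h_{i,i-2},$$

which are simply $p_{i,i-2} h_{i-2,i-2} + h_{i,i-2} = g_{i,i-2}$ and $q_{i,i-2} g_{i-2,i-2} + g_{i,i-2} = h_{i,i-2}$.
Multiplying by $h_{i,i-2}$ and $g_{i,i-2}$, respectively, gives

$$h_{i,i-2} = h_{i,i-2} g_{i,i-2}, \qquad g_{i,i-2} = g_{i,i-2} h_{i,i-2}.$$

Then $g_{i,i-2} = h_{i,i-2}$, and $p_{i,i-2} h_{i-2,i-2} = q_{i,i-2} g_{i-2,i-2} = 0$ which implies

$$p_{i,i-2} h_{i-2,t} = q_{i,i-2} g_{i-2,t} = 0, \qquad t = 1, 2, \dots, i - 2.$$

Continuing this procedure yields $g_{it} = h_{it}$ for $t = 1, 2, \dots, i$. Now letting $i$
range from 1 to $n$ establishes the identity of $G$ and $H$. Hence $H$ is unique.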
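
The reduction to the first canonical form can be sketched in Python. This is a sketch under stated assumptions, not the author's own formulation: ring elements are bitmasks with XOR as addition and AND as multiplication, `one` is the unit, columns are processed from last to first as in Lemma 3.1, and the multipliers are generated by a running product of complements chosen so that the new diagonal entry is the join $b_{1j} \vee \dots \vee b_{jj} \vee (b_{j+1,j} h_{j+1,j+1}') \vee \dots \vee (b_{nj} h_{nn}')$. The names `first_canonical_form` and `is_first_canonical` are hypothetical.

```python
def first_canonical_form(a, one):
    """Reduce a square matrix over the bitmask model to the form H."""
    h = [row[:] for row in a]
    n = len(h)
    for j in range(n - 1, -1, -1):
        # Step E_j: add multiples of the other rows to row j.
        acc = one & ~h[j][j]                  # running product of complements
        for k in list(range(j)) + list(range(j + 1, n)):
            # Rows below row j may only contribute through the factor h_kk'.
            guard = one if k < j else one & ~h[k][k]
            x = guard & acc                   # multiplier for row k
            y = guard & h[k][j]               # term joined into h_jj
            h[j] = [u ^ (x & v) for u, v in zip(h[j], h[k])]
            acc &= one & ~y
        # Step F_j: add b_kj times the new row j to every other row k.
        for k in range(n):
            if k != j:
                c = h[k][j]
                h[k] = [u ^ (c & v) for u, v in zip(h[k], h[j])]
    return h


def is_first_canonical(h):
    """Check h_pq = 0 (q > p), h_pq h_qq = 0 (p > q), h_pq h_pp = h_pq."""
    n = len(h)
    for p in range(n):
        for q in range(n):
            if q > p and h[p][q] != 0:
                return False
            if p > q and h[p][q] & h[q][q] != 0:
                return False
            if h[p][q] & h[p][p] != h[p][q]:
                return False
    return True


A = [[5, 3, 6], [1, 7, 2], [4, 4, 0]]         # entries in the ring with unit 7
H = first_canonical_form(A, 7)

# Over the two-element ring, a matrix and a left-associate of it
# (row 1 added to row 0) reduce to the same form.
B1 = [[0, 0, 0], [1, 1, 0], [1, 0, 0]]
B2 = [[1, 1, 0], [1, 1, 0], [1, 0, 0]]
H1 = first_canonical_form(B1, 1)
H2 = first_canonical_form(B2, 1)
```

The last two matrices illustrate the uniqueness statement: left-associated matrices lead to one and the same $H$.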


COROLLARY 3.1. Every unimodular matrix of $\mathfrak{M}$ is a product of a finite number
of elementary matrices.

If the matrix $A$ in the above theorem is unimodular, then $UA = H$, being
a product of unimodular matrices, is also unimodular. Then there exists a
matrix $K$ such that $KH = I$. The properties of the elements of $H$ are restrictive
enough to require that $H = K = I$. Since $U$ is a product of elementary
matrices, say $E_s E_{s-1} \cdots E_1$, we have $E_s E_{s-1} \cdots E_1 A = I$. Hence
$A = E_1 E_2 \cdots E_s$, the desired result. We remark that it is now obvious that $AU = I$
so that $U = A^{-1}$.

The canonical form $H$ does not have, in general, the maximum possible
number of rows whose elements are all 0's that could be obtained by elementary
row operations on $A$. The succeeding lemma makes this apparent. Our
procedure now will be to obtain a second canonical form for $A$ by performing
elementary operations on $H$ that will replace a row wherever possible by a
row of 0's and alter the form of $H$ as little as possible.


LEMMA 3.2. Let $H$ be the matrix described in the preceding theorem and
$h_{jj}, h_{j_1 j_1}, \dots, h_{j_t j_t}$, $j < j_1 < \dots < j_t$, be the diagonal elements in the last
$n - j + 1$ columns of $H$ that are different from 0. Then a necessary and sufficient
condition that there exists a unimodular matrix $V_j$, such that multiplication of
$H$ on the left by $V_j$ replaces $h_{jj}$ by 0, and leaves invariant the last $n - j$ columns
of $H$ and any row which consists entirely of 0's, is that $h_{jj} h_{j_1 j_1} h_{j_2 j_2} \cdots h_{j_t j_t} = 0$.


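The most general sequence of elementary operations that could be performed
on the rows of $H$ and leave the necessary things invariant is: the addition of
an arbitrary multiple, say $x_{j_r}$, of the elements of the $j$th row to the
corresponding elements of the $j_r$th row, for $r = 1, 2, \dots, t$; then the addition
of say $y_{j_r} h_{j_r j_r}'$, where $y_{j_r}$ is arbitrary, times the elements of the $j_r$th row
to the corresponding elements of the $j$th row, for $r = 1, 2, \dots, t$. This replaces $h_{jj}$ by

$$h_{jj} + \sum_{r=1}^{t} y_{j_r} h_{j_r j_r}' (h_{j_r j} + x_{j_r} h_{jj}),$$

which is simply

$$h_{jj} + h_{jj} \sum_{r=1}^{t} y_{j_r} x_{j_r} h_{j_r j_r}'$$

since

$$h_{j_r j_r}' h_{j_r j} = 0.$$

In order to be able to replace $h_{jj}$ by 0, under the required conditions it is then
necessary that there exist $x_{j_r}$ and $y_{j_r}$, $r = 1, 2, \dots, t$, such that

$$h_{jj} + h_{jj} \sum_{r=1}^{t} y_{j_r} x_{j_r} h_{j_r j_r}' = 0.$$

By adding $h_{jj}$ to both sides we obtain the equivalent condition

$$h_{jj} \sum_{r=1}^{t} y_{j_r} x_{j_r} h_{j_r j_r}' = h_{jj}.$$

Since

$$y_{j_r} x_{j_r} h_{j_r j_r}' \bigvee_{s=1}^{t} h_{j_s j_s}' = y_{j_r} x_{j_r} h_{j_r j_r}'$$

by (2.6), we have

$$h_{jj} \bigvee_{s=1}^{t} h_{j_s j_s}' = \Bigl(h_{jj} \sum_{r=1}^{t} y_{j_r} x_{j_r} h_{j_r j_r}'\Bigr) \bigvee_{s=1}^{t} h_{j_s j_s}'$$
$$= h_{jj} \sum_{r=1}^{t} \Bigl(y_{j_r} x_{j_r} h_{j_r j_r}' \bigvee_{s=1}^{t} h_{j_s j_s}'\Bigr)$$
$$= h_{jj} \sum_{r=1}^{t} y_{j_r} x_{j_r} h_{j_r j_r}'$$
$$= h_{jj}.$$

But this last relation implies, by (2.9), (2.7), and (2.4), that

$$h_{jj} h_{j_1 j_1} h_{j_2 j_2} \cdots h_{j_t j_t} = 0.$$

Hence the condition is necessary.
Conversely, suppose

$$h_{jj} h_{j_1 j_1} \cdots h_{j_t j_t} = 0.$$

Then

$$h_{jj} \bigvee_{r=1}^{t} h_{j_r j_r}' = h_{jj}$$

and the aforementioned sequence of operations with $x_{j_r}$ and $y_{j_r}$ chosen so that
$\sum_{r} y_{j_r} x_{j_r} h_{j_r j_r}' = \bigvee_{r} h_{j_r j_r}'$ (for instance $x_{j_r} = 1$, $y_{j_1} = 1$,
$y_{j_r} = h_{j_1 j_1} \cdots h_{j_{r-1} j_{r-1}}$ for $r > 1$, by (2.5)) replaces $h_{jj}$ by

$$h_{jj} + h_{jj} \sum_{r=1}^{t} y_{j_r} x_{j_r} h_{j_r j_r}' = h_{jj} + h_{jj} \bigvee_{r=1}^{t} h_{j_r j_r}' = h_{jj} + h_{jj} = 0$$

and leaves the necessary things invariant. Thus the condition is sufficient and
the lemma is proved.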
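
The criterion of Lemma 3.2 and the sequence of operations used in its proof can be sketched as follows. The sketch rests on assumptions that are not part of the text: ring elements are bitmasks (XOR as addition, AND as multiplication, `one` as the unit), the input is taken to be already in the first canonical form, every $x_{j_r}$ is chosen equal to 1, and the multipliers $y_{j_r}$ are built from a running product so that their sum is the join of the complements. The name `zero_out_diagonal` is hypothetical.

```python
def zero_out_diagonal(h, j, one):
    """Try to replace h[j][j] by 0 as in Lemma 3.2.

    Returns the new matrix, or None when the product of h[j][j] with the
    non-zero diagonal elements below it is different from 0.
    """
    n = len(h)
    rows = [r for r in range(j + 1, n) if h[r][r] != 0]
    prod = h[j][j]
    for r in rows:
        prod &= h[r][r]
    if prod != 0:
        return None                     # the criterion of the lemma fails
    g = [row[:] for row in h]
    # First add row j to each later row with a non-zero diagonal (x = 1).
    for r in rows:
        g[r] = [u ^ v for u, v in zip(g[r], g[j])]
    # Then add y * h_rr' times row r to row j; the multipliers sum to the
    # join of the complements h_rr'.
    acc = one
    for r in rows:
        mult = acc & (one & ~h[r][r])
        g[j] = [u ^ (mult & v) for u, v in zip(g[j], g[r])]
        acc &= h[r][r]
    return g


H = [[1, 0], [0, 2]]                    # unit 3: diagonal elements 1 and 2
C = zero_out_diagonal(H, 0, 3)          # product 1 & 2 = 0, so row 0 vanishes
blocked = zero_out_diagonal([[1, 0], [0, 1]], 0, 3)   # product 1, not possible
```

In the first example the two diagonal elements are different from 0 but have product 0, so a row of 0's can be produced although $H$ itself has none.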

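Let us now determine precisely what happens to the elements in the first $j$
columns of $H$ when $h_{jj}$ is replaced by 0 in the manner described in Lemma 3.2.
Since

$$h_{j_r j_r}' h_{j_r q} = 0,$$

$h_{jq}$, for $q = 1, 2, \dots, j - 1$, is replaced by

$$h_{jq} + h_{jq} \bigvee_{r=1}^{t} h_{j_r j_r}'.$$

It is necessary that

$$h_{jj} h_{j_1 j_1} \cdots h_{j_t j_t} = 0$$

so that

$$h_{jj} \bigvee_{r=1}^{t} h_{j_r j_r}' = h_{jj}.$$

Using this and the fact that $h_{jq} = h_{jq} h_{jj}$, we have

$$h_{jq} + h_{jq} \bigvee_{r=1}^{t} h_{j_r j_r}' = h_{jq} + h_{jq} h_{jj} \bigvee_{r=1}^{t} h_{j_r j_r}'$$
$$= h_{jq} + h_{jq} h_{jj} = h_{jq} + h_{jq}$$
$$= 0.$$

Thus replacing $h_{jj}$ by 0 replaces $h_{jq}$, for $q = 1, 2, \dots, j - 1$, by 0 also. For
$r = 1, 2, \dots, t$ and $q = 1, 2, \dots, j$, $h_{j_r q}$ is replaced by

$$d_{j_r q} = h_{j_r q} + h_{jq}.$$

We observe that

$$d_{j_r q} h_{qq} = (h_{j_r q} + h_{jq}) h_{qq} = h_{j_r q} h_{qq} + h_{jq} h_{qq} = 0,$$

so that the property of $H$ that the product of an element with the diagonal
element above it be 0 is preserved. Although

$$d_{j_r q} h_{j_r j_r} \ne d_{j_r q}$$

in general, we note that

$$d_{j_r q} (h_{jj} \vee h_{j_r j_r}) = (h_{j_r q} + h_{jq})(h_{jj} \vee h_{j_r j_r})$$
$$= h_{j_r q} (h_{jj} \vee h_{j_r j_r}) + h_{jq} (h_{jj} \vee h_{j_r j_r})$$
$$= h_{j_r q} h_{j_r j_r} (h_{jj} \vee h_{j_r j_r}) + h_{jq} h_{jj} (h_{jj} \vee h_{j_r j_r})$$
$$= h_{j_r q} h_{j_r j_r} + h_{jq} h_{jj} = h_{j_r q} + h_{jq}$$
$$= d_{j_r q}.$$

We also see that, for $q = 1, 2, \dots, j$,

$$h_{jq} h_{j_1 j_1} h_{j_2 j_2} \cdots h_{j_t j_t} = h_{jq} h_{jj} h_{j_1 j_1} \cdots h_{j_t j_t} = 0.$$

Hence $h_{j_r q}$ is replaced by an element

$$d_{j_r q} = h_{j_r q} + h_{jq}$$

such that

$$d_{j_r q} h_{qq} = 0, \qquad h_{jq} h_{jj} = h_{jq}, \qquad h_{jq} h_{j_1 j_1} \cdots h_{j_t j_t} = 0,$$
$$d_{j_r q} (h_{jj} \vee h_{j_r j_r}) = d_{j_r q}.$$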


Now let $H_1$ denote the matrix resulting from replacing $h_{jj}$ by 0 according
to the procedure just described. We want to consider the problem of replacing
a diagonal element, say $h_{ii}$, of $H_1$ by 0 using elementary operations that leave
invariant the last $n - i$ columns and any row whose elements are all 0's.
Let

$$h_{i_1 i_1}, h_{i_2 i_2}, \dots, h_{i_s i_s}, \qquad i < i_1 < i_2 < \dots < i_s < j,$$

denote the diagonal elements of $H_1$ between $h_{ii}$ and $h_{jj}$
which are not 0. When we attempt to parallel the discussion of Lemma 3.2 we
find that, although we can add

$$y_{i_r} h_{i_r i_r}', \quad \text{where } y_{i_r} \text{ is arbitrary},$$

times the elements of the $i_r$th row to the corresponding elements of the $i$th
row, we can't add

$$y_{j_r} h_{j_r j_r}', \quad \text{where } y_{j_r} \text{ is arbitrary},$$

times the elements of the $j_r$th row to the corresponding elements of the $i$th
row. In order to leave invariant the last $n - i$ columns of $H_1$ we must add
instead

$$y_{j_r} h_{jj}' h_{j_r j_r}', \quad \text{where } y_{j_r} \text{ is arbitrary},$$

times the elements of the $j_r$th row to the corresponding elements of the $i$th
row. With only this change, however, we obtain the following result. A
necessary and sufficient condition that $h_{ii}$ can be replaced by 0, by means of
elementary operations that leave invariant the last $n - i$ columns and any
row consisting of 0's, is that

$$h_{ii} h_{i_1 i_1} \cdots h_{i_s i_s} (h_{j_1 j_1} \vee h_{jj})(h_{j_2 j_2} \vee h_{jj}) \cdots (h_{j_t j_t} \vee h_{jj}) = 0.$$

If this condition is satisfied, to replace $h_{ii}$ by 0 we choose

$$x_{i_1} = x_{i_2} = \dots = x_{i_s} = x_{j_1} = x_{j_2} = \dots = x_{j_t} = 1.$$

That is, we first add the $i$th row to each succeeding row which does not consist
entirely of 0's. Then a multiple of the elements of each of these rows is added
to the corresponding elements of the $i$th row. Choosing the $y$'s appropriately,
this replaces $h_{iq}$ $(q = 1, 2, \dots, i)$ by

$$h_{iq} + h_{iq} \Bigl[\bigvee_{r=1}^{s} h_{i_r i_r}' \vee \bigvee_{r=1}^{t} (h_{jj}' h_{j_r j_r}')\Bigr] = h_{iq} + h_{iq} = 0.$$

We note that

$$h_{i_r q}, \qquad r = 1, 2, \dots, s; \quad q = 1, 2, \dots, i,$$

is replaced by

$$h_{i_r q} + h_{iq}$$

and

$$d_{j_r q}, \qquad q = 1, 2, \dots, i; \quad r = 1, 2, \dots, t,$$

is replaced by

$$d_{j_r q} + h_{iq} = h_{j_r q} + h_{jq} + h_{iq}.$$

Denote this matrix by $H_2$.
We are now able to describe the procedure for obtaining from the first
canonical form $H$ the second canonical form, which we shall call $C$. Let
$h_{l_1 l_1}, h_{l_2 l_2}, \dots, h_{l_m l_m}$, $l_1 < l_2 < \dots < l_m$, be the diagonal elements
of $H$ which are different from 0, and consider successively the products

$$h_{l_{m-1} l_{m-1}} h_{l_m l_m}, \quad h_{l_{m-2} l_{m-2}} h_{l_{m-1} l_{m-1}} h_{l_m l_m}, \quad \dots, \quad h_{l_1 l_1} h_{l_2 l_2} \cdots h_{l_m l_m}.$$

If none of these are 0, then $C = H$. Otherwise, there is a first one, say

$$h_{jj} h_{j_1 j_1} \cdots h_{j_t j_t},$$

which is 0. In this case replace $h_{jj}$ by 0 according to the procedure described
in Lemma 3.2. Let

$$Z_j = (h_{j_1 j_1} \vee h_{jj})(h_{j_2 j_2} \vee h_{jj}) \cdots (h_{j_t j_t} \vee h_{jj}),$$

and consider successively the products formed in the same manner from $Z_j$ and
the diagonal elements of $H_1$ preceding $h_{jj}$ which are different from 0.
If none of these are 0, then $C = H_1$. Otherwise, there is a first one, say

$$h_{ii} h_{i_1 i_1} \cdots h_{i_s i_s} Z_j,$$

which is 0. In this case, replace $h_{ii}$ by 0 as before. Let

$$Z_{ij} = (h_{i_1 i_1} \vee h_{ii}) \cdots (h_{i_s i_s} \vee h_{ii})(h_{j_1 j_1} \vee h_{jj} \vee h_{ii}) \cdots (h_{j_t j_t} \vee h_{jj} \vee h_{ii})$$

and let

$$h_{k_1 k_1}, h_{k_2 k_2}, \dots, h_{k_u k_u}, \qquad k_1 < k_2 < \dots < k_u,$$

denote the diagonal elements in the first $i - 1$ columns of $H_2$ which are not 0.
Consider successively the products formed in the same manner from $Z_{ij}$ and
these diagonal elements.
If none of these are 0, then $C = H_2$. Otherwise, there is a first one, say

$$h_{kk} h_{k_1 k_1} \cdots Z_{ij},$$

which is 0. Replace $h_{kk}$ by 0 in, what should be by now, the
obvious manner. Continuing this procedure yields the desired matrix $C$.
Obviously each one of these steps can be accomplished by multiplying the
particular $H_i$ on the left by a unimodular matrix $V_i$. Hence there is a
unimodular matrix $V$ such that $VH = C$.

Note that replacing $h_{tt}$ by 0 affects the element in the $(p, q)$-position of
$H$ only if $q \le t < p$. Then, for $c_{pp} \ne 0$, if we let

$$c_{e_1 e_1}, c_{e_2 e_2}, \dots, c_{e_s e_s}$$

denote the diagonal elements of $C$ between $c_{q-1,q-1}$ and $c_{pp}$ which are 0, we
see that

$$c_{pq} = h_{pq} + h_{e_1 q} + h_{e_2 q} + \dots + h_{e_s q}.$$

(Although the $c_{e_r e_r}$'s include any diagonal element that was originally 0 in $H$,
say $h_{e_r e_r}$, this does not affect the representation of $c_{pq}$ since the corresponding
$h_{e_r q} = 0$.) We summarize all this in the following theorem.

THEOREM 3.2. Let $A$ be any matrix of $\mathfrak{M}$ and $UA = H$ its first canonical
form. Then there exists a unimodular matrix $V$ such that $WA = VUA = VH = C$,
where $W = VU$, has the following form: $c_{pq} = 0$ for $q > p$; if $c_{pp} = 0$, then
$c_{pq} = 0$ for $q = 1, 2, \dots, p$; if $c_{pp} \ne 0$ and

$$c_{e_1 e_1}, c_{e_2 e_2}, \dots, c_{e_s e_s}$$

denote the diagonal elements of $C$ between $c_{q-1,q-1}$ and $c_{pp}$ which are 0, then

$$c_{pq} = h_{pq} + h_{e_1 q} + h_{e_2 q} + \dots + h_{e_s q}.$$

Furthermore, it is impossible to replace a diagonal element of $C$ by 0 using
elementary operations that leave invariant the succeeding columns and any row which
consists entirely of 0's. This form $C$ is unique.




The proof that $C$ is unique proceeds along the same lines as the proof of
the uniqueness of $H$, and will be omitted. We wish to emphasize, however,
that to ensure uniqueness it is absolutely necessary to add the elements of
the $k$th row to the corresponding elements of each succeeding row whose
elements are not all 0's, as the first step in replacing any diagonal element
$h_{kk}$ by 0.

THEOREM 3.3. A necessary and sufficient condition that two matrices $A$ and $B$
of $\mathfrak{M}$ be left-associates is that they have the same canonical form $H$ (or $C$).

If $PB = A$, where $P$ is unimodular, let $U$ be a unimodular matrix such that
$UA = H$ is the first canonical form of $A$. Then $H = UPB = VB$, where
$V = UP$ is unimodular, so that $H$ is the first canonical form of $B$ also.

Conversely, suppose that $E$ and $F$ are unimodular matrices such that $EA =
FB = H$ is the first canonical form of $A$ and of $B$. Then $QB = A$, where
$Q = E^{-1}F$ is unimodular, and $A$ and $B$ are left-associates.


4. Mutual left-divisibility, g.c.r.d., and l.c.l.m. Two matrices $A$ and $B$
of $\mathfrak{M}$ are said to be mutually left-divisible if and only if there exist matrices
$R$ and $T$ of $\mathfrak{M}$ such that $RA = B$ and $TB = A$. It is well known that the
concepts of mutual left-divisibility and left-associativity are equivalent for
matrices with elements in a principal ideal ring. Steinitz (5) has shown their
equivalence for matrices with elements in an algebraic domain. Kaplansky
(3) considered this problem and obtained some results based on the radical
of a ring. We now show that the two concepts are equivalent for matrices of $\mathfrak{M}$.
If $A$ and $B$ are left-associates so that $PA = B$, where $P$ is unimodular, then
$P^{-1}B = A$ and $A$ and $B$ are mutually left-divisible. Conversely, suppose
$RA = B$ and $TB = A$. Let $UA = H$ and $VB = G$ be the first canonical forms
of $A$ and $B$, respectively. Then $A = U^{-1}H$ and $B = V^{-1}G$ imply $RU^{-1}H = V^{-1}G$
and $TV^{-1}G = U^{-1}H$. Whence, $PH = G$ and $QG = H$, where $P = VRU^{-1}$
and $Q = UTV^{-1}$; that is, $H$ and $G$ are mutually left-divisible. In proving the
uniqueness of the first canonical form $H$, we showed that $PH = G$ and
$QG = H$, where $P$ and $Q$ are unimodular, imply $H = G$. However, the
unimodularity of $P$ and $Q$ was not used anywhere in this proof. Hence, we
established at that point also that if $H$ and $G$ are mutually left-divisible, then
$H = G$. This enables us to state the following result.

THEOREM 4.1. A necessary and sufficient condition that two matrices $A$ and $B$
of $\mathfrak{M}$ be mutually left-divisible is that they be left-associates.


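Let us now consider the matrix

$$\begin{bmatrix} A & 0 \\ B & 0 \end{bmatrix}$$

of order $2n$. Then there exists a unimodular matrix $X$ of order $2n$, which we
write in the form of $n \times n$ blocks, such that

$$\begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \begin{bmatrix} A & 0 \\ B & 0 \end{bmatrix} = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}$$

is the first canonical form of the above matrix. Thus

$$X_{11} A + X_{12} B = D$$

so that every c.r.d. of $A$ and $B$ is a right divisor of $D$. Since $X$ is unimodular
there exists a matrix $Y = X^{-1}$ such that

$$\begin{bmatrix} A & 0 \\ B & 0 \end{bmatrix} = \begin{bmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{bmatrix} \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix},$$

whence $A = Y_{11} D$, $B = Y_{21} D$, so that $D$ is a c.r.d. of $A$ and $B$. Hence $D$ is a
g.c.r.d. of $A$ and $B$.

The matrix $M = X_{21} A = X_{22} B$ is a c.l.m. of $A$ and $B$. Using an argument
due to Stewart (6), we are able to show that $M$ is the l.c.l.m. of $A$ and $B$
when $D = I$. To do this, let $M_1 = UA = VB$ be any other c.l.m. of $A$ and
$B$. We can then write the following equations

$$\begin{bmatrix} X_{11} & X_{12} \\ U & V \end{bmatrix} \begin{bmatrix} A & 0 \\ B & 0 \end{bmatrix} = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix},$$

$$\begin{bmatrix} X_{11} & X_{12} \\ U & V \end{bmatrix} \begin{bmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{bmatrix} \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}.$$

Consider the most general solution of the equation

$$\begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix} \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}.$$

Here $Z_{12}$ and $Z_{22}$ are arbitrary, but $Z_{11}$ and $Z_{21}$ must be chosen so that

$$Z_{11} D = D, \qquad Z_{21} D = 0.$$

Subject to these conditions the following equations must hold

$$\begin{bmatrix} X_{11} & X_{12} \\ U & V \end{bmatrix} \begin{bmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{bmatrix} = \begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix},$$

$$\begin{bmatrix} X_{11} & X_{12} \\ U & V \end{bmatrix} = \begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix} \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix}.$$

In particular, it appears that $U$ has the form

$$U = Z_{21} X_{11} + Z_{22} X_{21}.$$

But if $D = I$ the only solution of $Z_{21} D = 0$ is $Z_{21} = 0$. Hence it follows in
this case that $U = Z_{22} X_{21}$; then from $M_1 = UA = Z_{22} X_{21} A = Z_{22} M$ it
follows that $M = X_{21} A = X_{22} B$ is indeed a l.c.l.m. of $A$ and $B$. These results
are stated in the following theorem.

THEOREM 4.2. In the matric equation

$$\begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \begin{bmatrix} A & 0 \\ B & 0 \end{bmatrix} = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix},$$

written in the form of $n \times n$ blocks, where $X$ is unimodular, the matrix $D$ is in
all cases a g.c.r.d. of $A$ and $B$; if $D = I$, then the matrix $M = X_{21} A = X_{22} B$
is a l.c.l.m. of $A$ and $B$.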


THEOREM 4.3. The g.c.r.d. $D$ and the l.c.l.m. $M$ of two matrices $A$ and $B$
are uniquely determined up to unimodular left factors.

If $D$ and $D_1$ are two g.c.r.d.'s of $A$ and $B$, then each is a left multiple of the other,
say $D = UD_1$ and $D_1 = VD$. Then by Theorem 4.1, $D$ and $D_1$ are
left-associates.

If $M$ and $M_1$ are two l.c.l.m.'s of $A$ and $B$, then each is a right
divisor of the other, say $M_1 = UM$ and $M = VM_1$. Then by Theorem 4.1,
$M$ and $M_1$ are left-associates.

5. Conclusion. The analogy of our results to the corresponding ones for
the classical case seems remarkable when one considers that a principal ideal
ring contains no proper divisors of 0, whereas every element of a Boolean
ring except 1 and 0 is a proper divisor of 0. Finally we mention that the
restriction to square matrices was inessential.


REFERENCES

1. E. T. Bell, Arithmetic of logic, Trans. Amer. Math. Soc., 29 (1927), 597-611.
2. G. Birkhoff, Lattice Theory, Amer. Math. Soc. Colloq. Publ., 25 (rev. ed., New York, 1948).
3. I. Kaplansky, Elementary divisors and modules, Trans. Amer. Math. Soc., 66 (1949), 464-491.
4. C. C. MacDuffee, Matrices with elements in a principal ideal ring, Bull. Amer. Math. Soc., 39 (1933), 564-584.
5. E. Steinitz, Rechteckige Systeme und Moduln in algebraischen Zahlkörpern, Math. Ann., 71 (1911), 328-354, and 72 (1912), 297-345.
6. B. M. Stewart, A note on least common left multiples, Bull. Amer. Math. Soc., 55 (1949), 587-591.

University of Florida











CHARACTERISTIC POLYNOMIALS 
HANS SCHNEIDER 


Introduction. Let $F$ be a field and let $V$ be a finite dimensional vector
space over $F$ which is also a module over the ring $F[a]$. Here $a$ may lie in any
extension ring of $F$. We do not assume, as yet, that $V$ is a faithful module, so
that $a$ need not be a linear transformation on $V$. It is known that by means
of a decomposition of $V$ into cyclic $F[a]$-modules we may obtain a definition
of the characteristic polynomial of $a$ on $V$ which does not involve
determinants. In this note we shall give another non-determinantal definition of the
characteristic polynomial. Instead of considering a single module $V$, we shall
accordingly study the set of all finite dimensional $F[a]$-modules and mappings
of this set into monic polynomials with coefficients in $F$. Admittedly our
procedure does not yield the theory of the elementary divisors of $a$, but it
has certain advantages. First, all questions of uniqueness are settled
immediately by the Jordan-Hölder theorem. Secondly, it is possible to derive
some classical results, usually proved using determinants, without excessive
labour. To illustrate the use of our method we shall complete and generalise
some results due to Goldhaber (2) and Osborne (5).


1. A principal ideal in a semigroup of mappings. Let $\mathfrak{V}$ be the set of
all finite dimensional $F[a]$-modules, and let $\mathfrak{M}$ be the multiplicative semigroup
of (non-zero) monic polynomials with coefficients in $F$. Let $T$ be the set of
mappings of $\mathfrak{V}$ into $\mathfrak{M}$. We shall assume throughout that if $\psi$ is a mapping of
$T$, and if $V_1$ and $V_2$ are isomorphic modules of $\mathfrak{V}$, then $\psi(V_1, t) = \psi(V_2, t)$.
The set $T$ becomes a semigroup if we define multiplication in the obvious way,
viz. $\psi_1 \psi_2(V, t) = \psi_1(V, t) \cdot \psi_2(V, t)$ for all $V$ in $\mathfrak{V}$. The subset $S$ of $T$ consisting
of all mappings satisfying

(1) $\psi(V, t) = \psi(V/Z, t) \cdot \psi(Z, t)$

for all $V \in \mathfrak{V}$ and all submodules $Z$ of $V$, is also a semigroup. Now let $\mu$ be
any mapping of $T$ such that $\mu((0), t) = 1$ and

(2) $\mu(V, t)$ divides $\mu(V/Z, t) \cdot \mu(Z, t)$

for all $V \in \mathfrak{V}$ and all submodules $Z$ of $V$. Let $\mathfrak{C}$ be the ideal in the semigroup
$S$ consisting of all $\psi$ of $S$ divisible by $\mu$ in $T$, i.e. let $\mathfrak{C} = \mu T \cap S$.
Then an element $\psi$ of $T$ belongs to $\mathfrak{C}$ if and only if $\psi$ satisfies (1) and

(3) $\mu(V, t)$ divides $\psi(V, t)$

for all $V \in \mathfrak{V}$.

Received March 1, 1956. I should like to express my gratitude to Dr. I. T. Adamson and
Dr. M. P. Drazin for discussing with me the results of this paper.




Let $V = V_0 \supset \dots \supset V_r = (0)$ be a composition series for the $F[a]$-module
$V$. Then it is an immediate consequence of the Jordan-Hölder theorem that
the polynomial

(4) $\chi(V, t) = \prod_{i=1}^{r} \mu(V_{i-1}/V_i, t) \cdot \mu((0), t)$

depends only on the module $V$ and not on the composition series we use in
the definition. (The last factor $\mu((0), t) = 1$ has been put in to cover the case
$V = (0)$, yielding $\chi((0), t) = \mu((0), t) = 1$. In some of the arguments below
it has been assumed that $V \ne (0)$, the case $V = (0)$ being trivial.)

This remark allows us to think of $\chi$ as a mapping of $\mathfrak{V}$ into $\mathfrak{M}$, i.e. as an
element of the semigroup $T$. The semigroup $S$ is not a principal ideal
semigroup. But we shall prove

LEMMA 1. The mapping $\chi$ defined by (4) is the unique generator of the ideal
$\mathfrak{C}$ in $S$.


Proof. We shall first show that $\chi \in S$. If $Z$ is any submodule of $V$
then there is a composition series $V = V_0' \supset \dots \supset V_s' = (0)$ in which $Z$
occurs, say $Z = V_r'$. Thus it follows from our remarks about the Jordan-Hölder
theorem, and since $(V_{i-1}'/Z)/(V_i'/Z)$ is isomorphic to $V_{i-1}'/V_i'$,
that

$$\chi(V/Z, t)\,\chi(Z, t) = \prod_{i=1}^{r} \mu(V_{i-1}'/V_i', t) \prod_{i=r+1}^{s} \mu(V_{i-1}'/V_i', t) = \chi(V, t),$$

whence $\chi \in S$.
It follows from (3) applied to the factor modules that

$$\prod_{i=1}^{r} \mu(V_{i-1}/V_i, t)$$

divides $\chi(V, t)$. Hence by (2), $\mu(V, t)$ divides $\chi(V, t)$. Thus $\mu$ divides $\chi$,
and we deduce that $(\chi) \subset \mathfrak{C}$.
To prove the reverse inclusion let us suppose that $\psi \in \mathfrak{C}$. Then for any
$V \in \mathfrak{V}$, we have

$$\psi(V, t) = \prod_{i=1}^{r} \psi(V_{i-1}/V_i, t)$$

by (1). But, by (3), $\mu(V_{i-1}/V_i, t)$ divides $\psi(V_{i-1}/V_i, t)$, whence $\chi(V, t)$
divides $\psi(V, t)$. It follows that $\chi$ divides $\psi$ in $T$, say $\psi = \chi\psi^*$. To show that
$\psi^*$ belongs to $S$ we observe that for all submodules $Z$ of $V$

$$\chi(V, t)\,\psi^*(V, t) = \psi(V, t) = \psi(V/Z, t)\,\psi(Z, t)$$
$$= \chi(V/Z, t)\,\chi(Z, t)\,\psi^*(V/Z, t)\,\psi^*(Z, t) = \chi(V, t)\,\psi^*(V/Z, t)\,\psi^*(Z, t),$$

whence $\psi^*$ satisfies (1) since $\chi(V, t)$ is non-zero. We have now proved that
$\psi \in (\chi)$, and this implies that $\mathfrak{C} \subset (\chi)$. It follows that $(\chi) = \mathfrak{C}$, and we
conclude that $\chi$ is a generator of $\mathfrak{C}$.













To prove uniqueness, let us now suppose that $\mathfrak{C} = (\chi) = (\chi^*)$. Then
$\chi^* = \chi\psi$ and also $\chi = \chi^*\psi^*$. We obtain that $\chi = \chi\psi\psi^*$, and it follows that
$\psi\psi^* = \eta$, where $\eta(V, t) = 1$ for all $V \in \mathfrak{B}$. The mapping $\eta$ is the identity of
$\mathfrak{S}$ (and also, of course, of $\mathfrak{T}$), and since $\eta$ is the only invertible element of $\mathfrak{T}$, it follows
that $\psi = \eta$, whence $\chi = \chi^*$, and the lemma is proved.

COROLLARY. If $\psi$ belongs to the ideal $\mathfrak{C}$, and $\psi(V, t)$ has the same degree as
$\chi(V, t)$ for all $V$ in $\mathfrak{B}$, then $\chi = \psi$.


Proof. Clearly $\psi = \chi\psi^*$, where, for all $V$ in $\mathfrak{B}$, $\psi^*(V, t)$ is a monic
polynomial of degree 0, and the only such polynomial is 1.


2. The definition of the characteristic polynomial. As is well known,
the polynomials $p(t)$ in $F[t]$ for which $p(a)V = 0$ form an ideal in $F[t]$. By a
slight extension of the usual terminology, we shall call the monic generator
of this ideal the minimum polynomial of $a$ on $V$, and denote it by $\mu_a(V, t)$.
Thus $\mu_a$ is a mapping of $\mathfrak{B}$ into $\mathfrak{M}$ such that $\mu_a((0), t) = 1$ and which satisfies
(2) since

\[ \mu_a(Z, a) \cdot \mu_a(V/Z, a)\,V \subseteq \mu_a(Z, a)\,Z = (0), \]

when $Z$ is any submodule of $V$. It follows, therefore, that the ideal $\mathfrak{C}_a$, which
consists of all $\psi \in \mathfrak{S}$ divisible by $\mu_a$, and which we shall call the characteristic
ideal of $a$ in $\mathfrak{S}$, is a principal ideal with a unique generator. Thus we may make
the following definition:


Definition. Let $V \in \mathfrak{B}$. Then the characteristic polynomial of $a$ on $V$ is
$\chi_a(V, t)$, where $\chi_a$ is the unique generator of the characteristic ideal
$\mathfrak{C}_a = \mu_a\mathfrak{T} \cap \mathfrak{S}$ of $a$.


By virtue of Lemma 1, we obtain immediately


THEOREM 1. Let $V \in \mathfrak{B}$, and let $V = V_0 \supset \dots \supset V_s = (0)$ be a composition
series for $V$. Then the characteristic polynomial $\chi_a(V, t)$ of $a$ on $V$ is

\[ \chi_a(V, t) = \prod_{i=1}^{s} \mu_a(V_{i-1}/V_i, t) \cdot \mu_a((0), t). \tag{5} \]

COROLLARY 1. For each $V$ in $\mathfrak{B}$, the degree of $\chi_a(V, t)$ equals the dimension
of $V$.

Proof. In view of (5), we need only prove that the degree of $\mu_a(Z, t)$ equals
the dimension of $Z$, when $Z$ is an irreducible $F[a]$-module. This is well known
if $Z$ is faithful over $F[a]$. If $Z$ is not faithful over $F[a]$, then we must have
$aZ = 0$. The dimension of $Z$ must equal 1, and $\mu_a(Z, t) = t$.


COROLLARY 2. If $\psi$ belongs to the characteristic ideal of $a$, and if the degree
of $\psi(V, t)$ equals the dimension of $V$ for each $V \in \mathfrak{B}$, then $\psi = \chi_a$.


This corollary follows from the previous one and from the corollary to Lemma 1.



We shall now show that our definition of the characteristic polynomial 
leads to the usual one in the case of matrices. For any basis for V we obtain 
a matrix of a on V in the usual way. Since we have not assumed that a is a 
linear transformation on $V$, a non-zero $a$ may have a zero matrix on $V$. The
determinant of a matrix $A$ will be denoted by $|A|$.


THEOREM 2. If $A$ is any matrix of $a$ on $V$ and $I$ is the unit matrix, then

\[ \chi_a(V, t) = |tI - A|. \tag{6} \]

Proof. Any two bases for $V$ yield similar matrices of $a$ on $V$. Hence the
determinant $|tI - A|$ depends on $V$ only and not on the basis used to obtain $A$.
Thus we may define a mapping $\psi$ in $\mathfrak{T}$ by setting $\psi(V, t) = |tI - A|$.

Let $Z$ be any submodule of $V$. We choose a basis for $Z$ and complete it to a
basis for $V$. With respect to this basis the matrix of $a$ on $V$ is

\[ B = \begin{bmatrix} B_1 & B_3 \\ 0 & B_2 \end{bmatrix}, \]

where $B_1$ is a matrix of $a$ on $Z$, and $B_2$ is a matrix of $a$ on $V/Z$.
Hence

\[ \psi(V, t) = |tI - B| = |tI - B_1|\,|tI - B_2| = \psi(Z, t)\,\psi(V/Z, t), \]

and so $\psi$ belongs to $\mathfrak{S}$.
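The multiplicative step in this proof can be checked on a small case. The following is a minimal sketch, not part of the original argument: it assumes rational entries and uses the Faddeev-LeVerrier recursion to compute $|tI - A|$; the function names `char_poly`, `poly_mul` and the sample blocks `B1`, `B2`, `B` are illustrative choices.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def char_poly(a):
    """Coefficients [1, c_1, ..., c_n] of |tI - A|, highest power first.

    Faddeev-LeVerrier recursion; valid over the rationals.
    """
    n = len(a)
    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        m = mat_mul(a, m)                 # M_k = A M_{k-1} + c_{k-1} I
        for i in range(n):
            m[i][i] += coeffs[-1]
        am = mat_mul(a, m)
        trace = sum(am[i][i] for i in range(n))
        coeffs.append(-trace / k)         # c_k = -tr(A M_k) / k
    return coeffs


def poly_mul(p, q):
    """Product of two polynomials given highest power first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


# Block triangular matrix: B1 acts on Z, B2 on V/Z, the last column holds B3.
B1 = [[0, 1], [-1, 0]]
B2 = [[2]]
B = [[0, 1, 5],
     [-1, 0, 7],
     [0, 0, 2]]
```

Here `char_poly(B)` agrees with `poly_mul(char_poly(B1), char_poly(B2))`, whatever the off-diagonal block is.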


It is easily verified that $p(A) = 0$ is equivalent to $p(a)V = 0$, whence the
minimum polynomial of the matrix $A$ is just $\mu_a(V, t)$. Thus by the classical
Cayley-Hamilton theorem applied to the determinant $|tI - A|$, the polynomial
$\mu_a(V, t)$ divides $\psi(V, t)$. It follows that $\psi$ belongs to the characteristic ideal of $a$.
The degree of $\psi(V, t)$ is clearly equal to the dimension of $V$. We may now use
Corollary 2 to Theorem 1 to conclude that $\chi_a(V, t) = \psi(V, t) = |tI - A|$.

We may remark that it is possible to obtain an extension of some of the
above results to the case of $F[a]$-modules of infinite dimension over $F$, in
which case $V$ has no composition series and $\chi_a(V, t) = 0$.


3. Characteristic polynomials with a common factor. In this section
we shall again assume that the $F[a]$-module $V$ is finite dimensional. If $Z$ is a
submodule of $V$, then $\mu_a(V, t)$ is clearly divisible by both $\mu_a(V/Z, t)$ and $\mu_a(Z, t)$.
But the product $\mu_a(V/Z, t) \cdot \mu_a(Z, t)$ is divisible by $\mu_a(V, t)$. It follows by (5)
that $\chi_a(V, t)$ divides a power of $\mu_a(V, t)$. Thus we have proved from our
definitions the very well-known result that every irreducible factor of $\chi_a(V, t)$
is also a factor of $\mu_a(V, t)$. This enables us to prove the next lemma.


LEMMA 2. There exists a composition series for $V$ with the factor modules
appearing in any order.


Proof. Let $p(t)$ be any irreducible factor of $\chi_a(V, t)$ and therefore also of
$\mu_a(V, t)$. Let $q(t) = \mu_a(V, t)/p(t)$, and let $Z$ be the submodule of $V$ consisting
of all $v \in V$ for which $q(a)v = 0$. Then $p(a)V \subseteq Z$, and so $\mu_a(V/Z, t)$ divides
$p(t)$. But $Z$ is a proper submodule of $V$, whence $\mu_a(V/Z, t) \neq 1$. We deduce
that the minimum polynomial of $a$ on $V/Z$ and any of its irreducible submodules
is $p(t)$. Since we may start a composition series for $V$ with a composition
series for $V/Z$, the lemma follows.


LEMMA 3. Let $Z$ be a submodule of the $F[a]$-module $V$, and let $Y$ be a submodule
of the $F[b]$-module $W$. Let $\lambda_1$ be a vector space isomorphism of $Y$ onto $Z$. If $V/Z$
and $W/Y$ are irreducible modules for which $\mu_a(V/Z, t) = \mu_b(W/Y, t)$,
then there exists an extension of $\lambda_1$ to a vector space isomorphism $\lambda$ of $W$ onto $V$
such that $aw^\lambda - (bw)^\lambda \in Z$, for all $w \in W$.


Proof. The conditions on $V/Z$ and $W/Y$ imply that there is a vector space
isomorphism $\kappa$ of $W/Y$ onto $V/Z$ such that $a(w + Y)^\kappa = (b(w + Y))^\kappa$ for
all $w \in W$. Let $Z'$ be a subspace of $V$ complementary to $Z$, and let $Y'$ be a
subspace of $W$ complementary to $Y$. We shall now define a mapping $\lambda$ of $W$
into $V$ as follows.


We let $\lambda$ coincide with $\lambda_1$ on $Y$. If $w \in Y'$ we let $v = w^\lambda$ be the unique element
of $Z'$ for which $v + Z = (w + Y)^\kappa$. We then extend $\lambda$ linearly from $Y \cup Y'$
to $W = Y + Y'$. Then $\lambda$ is an isomorphism onto $V$, and the Lemma follows
since, for all $w \in Y'$, we have

\[ aw^\lambda + Z = a(w^\lambda + Z) = a(w + Y)^\kappa = (b(w + Y))^\kappa = (bw + Y)^\kappa = (bw)^\lambda + Z. \]


Let $\mathfrak{L}(V)$ be the algebra over $F$ of linear transformations on the finite
dimensional vector space $V$. If $W$ is also a vector space over $F$, and $\lambda$ is a
vector space isomorphism of $W$ onto $V$, then $\lambda$ induces an isomorphism $\rho$
of $\mathfrak{L}(W)$ onto $\mathfrak{L}(V)$, which may be defined by $b^\rho w^\lambda = (bw)^\lambda$, for all $w \in W$.
Conversely, let $\rho$ be an isomorphism of $\mathfrak{L}(W)$ onto $\mathfrak{L}(V)$. If we consider $F$ as a
subalgebra of both $\mathfrak{L}(W)$ and $\mathfrak{L}(V)$ in the normal way, then elements of $F$
are left fixed by $\rho$. Hence it can be shown that $\rho$ is induced by an isomorphism
$\lambda$ of $W$ onto $V$ (4, p. 237). It is this result which almost immediately yields
the following lemma.


LEMMA 4. Let $\rho$ be an isomorphism of the algebra of linear transformations
$\mathfrak{L}(W)$ onto $\mathfrak{L}(V)$. If $b \in \mathfrak{L}(W)$ and $c = b^\rho$, then $\chi_b(W, t) = \chi_c(V, t)$.


Proof. Let $\lambda$ be the isomorphism of the vector space $W$ onto $V$ which induces
$\rho$. Then any composition series for the $F[b]$-module $W$ is mapped by $\lambda$ into
a composition series for the $F[c]$-module $V$. It is easy to see that the minimum
polynomial of $b$ on a factor module of the first series equals the minimum
polynomial of $c$ on the corresponding factor module of the second series.
The lemma now follows from Theorem 1.

This lemma is rather less trivial than may appear at first sight. For, if $\rho$
is an isomorphism merely of the subalgebra $F[b]$ of $\mathfrak{L}(W)$ into $\mathfrak{L}(V)$, and
$c = b^\rho$, then we can only conclude that $\mu_b(W, t) = \mu_c(V, t)$.



Goldhaber (2) and Goldhaber and Whaples (3) have proved that if $A$ and
$B$ are square matrices with coefficients in $F$ such that there exists a non-singular
matrix $P$ for which $N = A - PBP^{-1}$ lies in the radical of $F[A, PBP^{-1}]$,
then $|tI - A| = |tI - B|$. This result is generally known as Goldhaber's
lemma. By means of the theory of canonical matrices and under the assumption
that $F$ is an infinite perfect field Osborne (5) has proved this lemma
together with its converse. From now on we shall assume that $a$ and $b$ are
linear transformations on the finite dimensional vector spaces $V$ and $W$
respectively, and we shall prove a theorem equivalent to Osborne's without
any restriction on the field $F$.


THEOREM 3. Let $a$ and $b$ be linear transformations on the finite dimensional
vector spaces $V$ and $W$ respectively. Then the characteristic polynomial of $a$ on $V$
equals the characteristic polynomial of $b$ on $W$ if and only if there is an
isomorphism $\rho$ of $\mathfrak{L}(W)$ onto $\mathfrak{L}(V)$ such that $n = a - b^\rho$ lies in the radical of
$F[a, b^\rho]$.

Proof. If $\rho$ is an isomorphism of $\mathfrak{L}(W)$ onto $\mathfrak{L}(V)$, and $c = b^\rho$, then by
Lemma 4 we need only prove that $\chi_a(V, t) = \chi_c(V, t)$ when $n = a - c$
lies in the radical of

\[ F[a, c] = F[a, n] = F[c, n]. \]

Let $Z$ be an irreducible $F[a, c]$-module. Then $nZ = (0)$, whence $Z$ is irreducible
also over $F[a]$ and $F[c]$. Further, $\mu_a(Z, t) = \mu_c(Z, t)$, since for a polynomial
$p(t)$ in $F[t]$ the equality $p(c)Z = 0$ implies $(p(a) + n')Z = 0$, where $n'$
belongs to the radical of $F[a, c]$, whence $p(a)Z = 0$; and conversely. Let

\[ V = V_0 \supset V_1 \supset \dots \supset V_s = (0) \]

be a composition series for the $F[a, c]$-module $V$. It follows from the remarks
we have just made that this series is also a composition series for $V$ as an
$F[a]$- and $F[c]$-module, and that

\[ \mu_a(V_{i-1}/V_i, t) = \mu_c(V_{i-1}/V_i, t), \qquad i = 1, \dots, s. \]

Hence by Theorem 1, $\chi_a(V, t) = \chi_c(V, t)$.
Now let us suppose that $\chi_a(V, t) = \chi_b(W, t)$, and let $V = V_0 \supset \dots \supset V_s
= (0)$ be a composition series for the $F[a]$-module $V$. We deduce from Lemma 2
that there is a composition series

\[ W = W_0 \supset \dots \supset W_s = (0) \]

for the $F[b]$-module $W$ such that $\mu_b(W_{i-1}/W_i, t) = \mu_a(V_{i-1}/V_i, t)$, for $i = 1,
\dots, s$. Then using Lemma 3 we may prove by induction that there is a vector
space isomorphism $\lambda$ of $W$ onto $V$ which takes $W_i$ onto $V_i$ such that
$aw^\lambda - (bw)^\lambda \in V_i$, whenever $w \in W_{i-1}$ $(i = 1, \dots, s)$. Let $\rho$ be the isomorphism
of $\mathfrak{L}(W)$ onto $\mathfrak{L}(V)$ associated with $\lambda$. If $v = w^\lambda \in V_{i-1}$, and $n = a - b^\rho$,
then

\[ nv = aw^\lambda - b^\rho w^\lambda = aw^\lambda - (bw)^\lambda \in V_i. \]

Thus for any polynomial $p(t)$ in $F[t]$ we have $(p(a)n)^s V = (0)$. Since $V$ is
faithful over $F[a, b^\rho]$, we conclude that $n$ lies in the radical of $F[a, b^\rho]$.


Lemma 4 allows us to state Theorem 3 symmetrically: The characteristic
polynomials $\chi_a(V, t)$ and $\chi_b(W, t)$ are equal if and only if there exist isomorphisms
$\sigma$ and $\tau$ of $\mathfrak{L}(V)$ and $\mathfrak{L}(W)$ respectively onto an algebra of linear
transformations $\mathfrak{L}(X)$ such that $n = a^\sigma - b^\tau$ lies in the radical of $F[a^\sigma, b^\tau]$.


Let $\mathfrak{I}(Z, V)$ be the subalgebra of $\mathfrak{L}(V)$ consisting of all linear transformations
$d \in \mathfrak{L}(V)$ for which $dZ \subseteq Z$. We note that $d \in \mathfrak{I}(Z, V)$ if and only if $Z$
is an $F[d]$-module. The natural homomorphism $d \to d'$ of $\mathfrak{I}(Z, V)$ onto
$\mathfrak{L}(Z)$ is defined by $d'v = dv$, for all $v \in Z$. Its kernel $K_1$ consists of all
$d \in \mathfrak{I}(Z, V)$ such that $dZ = (0)$. Clearly $\chi_{d'}(Z, t) = \chi_d(Z, t)$. There is also a
natural homomorphism of $\mathfrak{I}(Z, V)$ onto $\mathfrak{L}(V/Z)$, defined by $d \to d''$ where
$d''(v + Z) = dv + Z$ for all $v \in V$. Its kernel $K_2$ consists of all $d \in \mathfrak{I}(Z, V)$
such that $dV \subseteq Z$. Again $\chi_{d''}(V/Z, t) = \chi_d(V/Z, t)$.


LEMMA 5. Let $X$ be a vector space of dimension $r$ over $F$, and let $p(t)$ be a
monic polynomial of degree $r$ in $F[t]$. Then $p(t)$ divides $\chi_a(V, t)$ if and only if
there exists an $F[a]$-submodule $Z$ contained in $V$ and a homomorphism $\sigma$ of
$\mathfrak{I}(Z, V)$ onto $\mathfrak{L}(X)$ such that $\chi_c(X, t) = p(t)$, where $c = a^\sigma$.
Proof. Let $p(t)$ be a factor of $\chi_a(V, t)$. By Lemma 2 there exists an
$F[a]$-module $Z$ contained in $V$ for which $\chi_a(Z, t) = p(t)$, and by Corollary 1 to
Theorem 1 the dimension of $Z$ is $r$. Thus we may define the homomorphism
$\sigma$ of $\mathfrak{I}(Z, V)$ onto $\mathfrak{L}(X)$ to be the composed map of the natural homomorphism
$d \to d'$ of $\mathfrak{I}(Z, V)$ onto $\mathfrak{L}(Z)$, and any isomorphism of $\mathfrak{L}(Z)$ onto $\mathfrak{L}(X)$.
Then putting $c = a^\sigma$, we have

\[ \chi_c(X, t) = \chi_{a'}(Z, t) = \chi_a(Z, t) = p(t), \]

by virtue of Lemma 4.

Conversely, let $\sigma$ be a homomorphism of $\mathfrak{I}(Z, V)$ onto $\mathfrak{L}(X)$ for which
$\chi_c(X, t) = p(t)$. Using the simplicity of $\mathfrak{L}(X)$ it may be shown that the
kernel $K$ of $\sigma$ is $\mathfrak{I}(Z, V)$, $K_1$ or $K_2$. If $K = \mathfrak{I}(Z, V)$, then $\mathfrak{L}(X) = (0)$, whence
$X = (0)$ and $p(t) = 1$. If $K = K_1$, then $\mathfrak{L}(Z)$ is isomorphic to $\mathfrak{L}(X)$ under
an isomorphism which takes $a'$ onto $c = a^\sigma$. In this case we deduce that
$p(t) = \chi_{a'}(Z, t) = \chi_a(Z, t)$, and $\chi_a(Z, t)$ divides $\chi_a(V, t)$. If $K = K_2$, then
$\mathfrak{L}(V/Z)$ is isomorphic to $\mathfrak{L}(X)$ under an isomorphism taking $a''$ onto $c$, and so

\[ p(t) = \chi_{a''}(V/Z, t) = \chi_a(V/Z, t), \]

which again divides $\chi_a(V, t)$.


By combining the symmetric form of Theorem 3 and Lemma 5 we immediately
obtain a generalisation of Theorem 3.




THEOREM 4. Let $a$ and $b$ be linear transformations on the vector spaces $V$
and $W$ respectively. Let $X$ be a vector space of dimension $r$ over $F$. Then the
characteristic polynomials $\chi_a(V, t)$ and $\chi_b(W, t)$ have a common factor of
degree $r$ if and only if there exist an $F[a]$-module $Z$ contained in $V$ and a
homomorphism $\sigma$ of $\mathfrak{I}(Z, V)$ onto $\mathfrak{L}(X)$, an $F[b]$-module $Y$ contained in $W$ and a
homomorphism $\tau$ of $\mathfrak{I}(Y, W)$ onto $\mathfrak{L}(X)$, such that $n = a^\sigma - b^\tau$ lies in the
radical of $F[a^\sigma, b^\tau]$.


Finally, we claim that some of the results of Goddard and Schneider (1) and
other results on characteristic polynomials may be derived from Theorem 4.

REFERENCES 


1. L. S. Goddard and H. Schneider, Matrices with a non-zero commutator, Proc. Cambridge
Phil. Soc. 51 (1955), 551-553.

2. J. K. Goldhaber, The homomorphic mapping of certain matric algebras onto rings of diagonal
matrices, Can. J. Math. 4 (1952), 31-42.

3. J. K. Goldhaber and G. Whaples, On some matrix theorems of Frobenius and McCoy, Can.
J. Math. 5 (1953), 332-335.

4. N. Jacobson, Lectures on abstract algebra, vol. 2 (New York, 1953).

5. E. E. Osborne, On matrices having the same characteristic equation, Pacific J. Math. 2 (1952),
227-230.


Queen's University, Belfast and 
Washington State College 
Pullman, Wash. 














SOME THEOREMS ABOUT $p_r(n)$
MORRIS NEWMAN 


Introduction. If $n$ is a non-negative integer, define $p_r(n)$ as the coefficient
of $x^n$ in

\[ \prod_{n=1}^{\infty} (1 - x^n)^r; \]

otherwise define $p_r(n)$ as 0. In a recent paper (1) the author has proved that
if $r$ has any of the values 2, 4, 6, 8, 10, 14, 26 and $p$ is a prime $> 3$ such that
$r(p + 1) \equiv 0 \pmod{24}$, then

\[ p_r(np + \Delta) = (-p)^{r/2 - 1}\, p_r(n/p), \qquad \Delta = r(p^2 - 1)/24, \tag{1} \]

where $n$ is an arbitrary integer.
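Identity (1) can be spot-checked numerically for small cases. The sketch below is only an illustration: it expands the product by repeated multiplication with $(1 - x^n)$, and the helper names `p_r_coeffs` and `check_identity` are assumptions introduced here, not notation from the paper.

```python
def p_r_coeffs(r, limit):
    """Return [p_r(0), ..., p_r(limit)] for an integer r >= 0."""
    coeffs = [0] * (limit + 1)
    coeffs[0] = 1
    for n in range(1, limit + 1):
        for _ in range(r):
            # Multiply the truncated series by (1 - x^n) in place.
            for k in range(limit, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    return coeffs


def check_identity(r, p, n_max):
    """Check p_r(np + D) == (-p)^(r/2 - 1) p_r(n/p) for 0..n_max."""
    assert r % 2 == 0 and (r * (p + 1)) % 24 == 0
    delta = r * (p * p - 1) // 24
    coeffs = p_r_coeffs(r, n_max * p + delta)
    for n in range(n_max + 1):
        lhs = coeffs[n * p + delta]
        # p_r(n/p) is 0 unless p divides n.
        rhs = (-p) ** (r // 2 - 1) * coeffs[n // p] if n % p == 0 else 0
        if lhs != rhs:
            return False
    return True
```

For example, `check_identity(4, 5, 10)` covers $n = 0, \dots, 10$ for $r = 4$, $p = 5$, $\Delta = 4$, including the cases $n = 5$ and $n = 10$ where the right side does not vanish.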

In this note we wish to point out one or two additional facts implied by
identity (1). The first remark is that (1) furnishes a simple, uniform proof
of the Ramanujan congruences for partitions modulo 5, 7, 11, and a general
congruence will be proved. The second is that for the values of $r$ indicated,
$p_r(n)$ is zero for arbitrarily long strings of consecutive values of $n$. Finally,
some additional theorems not covered by (1) will be given without proof.

In what follows all products will be extended from 1 to $\infty$ and all sums from
0 to $\infty$, unless otherwise indicated.


THEOREM 1. Let $r = 4, 6, 8, 10, 14, 26$. Let $p$ be a prime greater than 3 such
that $r(p + 1) \equiv 0 \pmod{24}$, and set $\Delta = r(p^2 - 1)/24$. Then if $R \equiv r \pmod p$
and $n \equiv \Delta \pmod p$,

\[ p_R(n) \equiv 0 \pmod p. \tag{2} \]
Proof. Set $R = Qp + r$. Then

\[ \sum p_R(n)x^n = \prod (1 - x^n)^R = \prod (1 - x^n)^{Qp + r} \equiv \prod (1 - x^{np})^Q (1 - x^n)^r \pmod p. \]

Thus

\[ \sum p_R(n)x^n \equiv \sum p_Q(n)x^{np} \sum p_r(n)x^n \pmod p, \]

and so

\[ p_R(n) \equiv \sum_{j=0}^{n} p_Q(j/p)\, p_r(n - j) \pmod p, \]

or

\[ p_R(n) \equiv \sum_{0 \le j \le n/p} p_Q(j)\, p_r(n - pj) \pmod p. \]

Now (1) implies that for $r > 2$ and $n \equiv \Delta \pmod p$, $p_r(n - pj) \equiv 0 \pmod p$.
Thus $p_R(n) \equiv 0 \pmod p$, and so (2) is proved.


Received April 3, 1956. The preparation of this paper was supported (in part) by the
Office of Naval Research.

If we now note that for $R = -1$ the choices $r = 4$, $p = 5$; $r = 6$, $p = 7$;
and $r = 10$, $p = 11$ (all with $Q = -1$) are permissible, and that for these
values $\Delta = 4, 12, 50$ respectively, then the Ramanujan congruences $p(5n + 4)
\equiv 0 \pmod 5$, $p(7n + 5) \equiv 0 \pmod 7$, $p(11n + 6) \equiv 0 \pmod{11}$ follow as a
corollary, since $12 \equiv 5 \pmod 7$ and $50 \equiv 6 \pmod{11}$.
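The three Ramanujan congruences are easy to confirm numerically for small $n$. The sketch below computes the partition function $p(n) = p_{-1}(n)$ by Euler's pentagonal number recurrence; this recurrence is a standard tool chosen here for convenience and is not the method of the paper, and the name `partitions` is illustrative.

```python
def partitions(limit):
    """Return [p(0), ..., p(limit)] via Euler's pentagonal recurrence."""
    p = [0] * (limit + 1)
    p[0] = 1
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 == 1 else -1
            total += sign * p[n - k * (3 * k - 1) // 2]
            second = k * (3 * k + 1) // 2
            if n >= second:
                total += sign * p[n - second]
            k += 1
        p[n] = total
    return p


def ramanujan_holds(limit):
    """True if p(5n+4), p(7n+5), p(11n+6) vanish mod 5, 7, 11 up to limit."""
    p = partitions(limit)
    return (all(p[n] % 5 == 0 for n in range(4, limit + 1, 5))
            and all(p[n] % 7 == 0 for n in range(5, limit + 1, 7))
            and all(p[n] % 11 == 0 for n in range(6, limit + 1, 11)))
```

Calling `ramanujan_holds(500)` tests every admissible argument up to 500.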
We go on now to the second remark. We first prove the following lemma.


LEMMA 1. Let $a_1, a_2, \dots, a_{n+1}$ be non-zero pairwise relatively prime integers,
and let $c_1, c_2, \dots, c_n$ be arbitrary integers. Then the simultaneous diophantine
equations

\[ \begin{aligned} a_1x_1 - a_2x_2 &= c_1, \\ a_2x_2 - a_3x_3 &= c_2, \\ &\;\;\vdots \\ a_nx_n - a_{n+1}x_{n+1} &= c_n \end{aligned} \tag{3} \]

always have infinitely many solutions.


Proof. Put $T = a_{n+1}x_{n+1}$, $C_i = c_i + c_{i+1} + \dots + c_n$, $1 \le i \le n$. Then by
summing the rows of (3) 1, 2, ... at a time beginning with the last, we find
that the system (3) is equivalent to the system

\[ a_ix_i = T + C_i, \qquad 1 \le i \le n. \]

Since the $a$'s are pairwise relatively prime, the Chinese remainder theorem
assures us of the existence of an integer $C$ such that

\[ C \equiv -C_i \pmod{a_i}, \quad 1 \le i \le n, \qquad C \equiv 0 \pmod{a_{n+1}}. \]

Put $T = C + Ax$, where $A = a_1a_2 \dots a_{n+1}$. Then (3) has the solution

\[ x_i = \frac{C + C_i}{a_i} + \frac{Ax}{a_i}, \quad 1 \le i \le n, \qquad x_{n+1} = \frac{C}{a_{n+1}} + \frac{Ax}{a_{n+1}}, \]

where $x$ is arbitrary. Thus Lemma 1 is proved.
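The construction in this proof is effective and can be sketched directly. The following is a minimal illustration, assuming the $a_i$ are positive, greater than 1 and pairwise coprime; `solve_chain` is a hypothetical helper name, and the parameter `x` plays the role of the arbitrary integer $x$ above.

```python
from math import prod


def solve_chain(a, c, x=0):
    """Solve a[i]*x[i] - a[i+1]*x[i+1] = c[i] for i = 0..n-1.

    a: n+1 pairwise coprime integers greater than 1; c: n integers.
    Returns the list [x_1, ..., x_{n+1}] for the chosen free parameter x.
    """
    n = len(c)
    tail_sums = [sum(c[i:]) for i in range(n)]        # C_i = c_i + ... + c_n
    big_a = prod(a)                                   # A = a_1 ... a_{n+1}
    residues = [-s for s in tail_sums] + [0]
    # Chinese remainder theorem: C = -C_i (mod a_i), C = 0 (mod a_{n+1}).
    big_c = 0
    for modulus, residue in zip(a, residues):
        cofactor = big_a // modulus
        big_c += residue * cofactor * pow(cofactor, -1, modulus)
    big_c %= big_a
    t = big_c + big_a * x                             # T = C + A x
    return [(t + tail_sums[i]) // a[i] for i in range(n)] + [t // a[n]]


def satisfies(a, c, xs):
    """Check every equation of the chain system."""
    return all(a[i] * xs[i] - a[i + 1] * xs[i + 1] == c[i]
               for i in range(len(c)))
```

Different values of `x` give different solutions, which is the source of the "infinitely many" in the lemma.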
If we now notice, for example, that $p_r(np^2 + p + \Delta) = 0$ (obtained by
replacing $n$ by $np + 1$ in (1)) and that any two distinct primes $p$ are relatively
prime, we see that Lemma 1 implies













THEOREM 2. For $r = 2, 4, 6, 8, 10, 14, 26$, $p_r(n)$ vanishes for arbitrarily long
strings of consecutive values of $n$, arbitrarily many in number.


We remark that the same is true for $p_1(n)$, $p_3(n)$, because of the classical
identities

\[ \prod (1 - x^n) = \sum_{n=-\infty}^{\infty} (-1)^n x^{(3n^2 + n)/2}, \]
\[ \prod (1 - x^n)^3 = \sum (-1)^n (2n + 1) x^{(n^2 + n)/2}, \]

due respectively to Euler and Jacobi.
Finally, we state without proof some additional identities derivable in the 
same way that (1) was derived in (1); p is a prime in what follows. 


(4) p( np + oo(b — .)) = (-1)***' »(*) , p#1(mod 12), p > 3. 

(5) $p_6(3n + 2) = 9\,p_6(n/3)$.

(6) $p_8(2n + 1) = -8\,p_8(n/2)$ (due to van der Pol (2)).

ms 2 > P n ~ 

(7) bud np + —(p — .)) =p pu? ), pb =7 (mod 12). 
12 p 


(8) pul np + igh — .)) =—p n(*) , p = 5 (mod 12). 


REFERENCES 
1. M. Newman, An identity for the coefficients of certain modular forms, J. Lond. Math. Soc.
30 (1955), 488-495.
2. B. van der Pol, The representation of numbers as sums of eight, sixteen, and twenty-four
squares, Proc. Kon. Nederl. Akad. Wetensch. Ser. A 57 = Indagationes Math. 16
(1954), 349-361.


National Bureau of Standards, 
Washington, D.C. 







CLASSES OF POSITIVE DEFINITE 
UNIMODULAR CIRCULANTS 


MORRIS NEWMAN AND OLGA TAUSSKY 


All matrices considered here have rational integral elements. In particular
some circulants of this nature are investigated. An $n \times n$ circulant is of the
form

\[ C = \begin{bmatrix} c_0 & c_1 & \dots & c_{n-1} \\ c_{n-1} & c_0 & \dots & c_{n-2} \\ \vdots & & & \vdots \\ c_1 & c_2 & \dots & c_0 \end{bmatrix}. \]

The following result concerning positive definite unimodular circulants was
obtained recently (3; 4):


Let $C$ be a unimodular $n \times n$ circulant and assume that $C = AA'$, where
$A$ is an $n \times n$ matrix and $A'$ its transpose. Then it follows that $C = C_1C_1'$ where
$C_1$ is again a circulant.


For general unimodular matrices the assumption $C = AA'$ is stronger than
symmetry and positive definiteness if and only if $n \ge 8$, as was shown by
Minkowski (1). The question therefore arises whether symmetry and positive
definiteness suffice even for $n \ge 8$ in the theorem above; or in other words,
whether a unimodular symmetric positive definite circulant is necessarily
of the form $AA'$. (In this connection it was shown by I. Schoenberg (in a
written communication) that a hermitian positive definite circulant with
arbitrary complex elements is always of the form $AA'$ where $A$ is again a
circulant.)


It will be shown that the circulant $M$ whose first row is
$(2, 1, 0, -1, -1, -1, 0, 1)$
is positive definite, unimodular, but not of the form $AA'$.


Mordell (2) showed that every symmetric positive definite unimodular
$8 \times 8$ matrix which is not of the form $AA'$ is congruent to the matrix $K$ which
corresponds to the quadratic form

\[ \sum_{i=1}^{8} x_i^2 + \Bigl(\sum_{i=1}^{8} x_i\Bigr)^2 - 2x_1x_2 - 2x_2x_8. \]

The circulant $M$ therefore is congruent to $K$.


Received April 12, 1956. The preparation of this paper was sponsored (in part) by the
Office of Naval Research.




THEOREM 1. The circulant $M$ is not of the form $AA'$.


Proof. Any matrix of the form $AA'$ corresponds to a quadratic form which
represents all positive integers if $n \ge 4$, but certainly represents both odd and even
integers for any $n$. The quadratic form corresponding to $M$, however, represents
only even integers. This proves the theorem.
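The claims about $M$ can be verified by direct computation. The sketch below is an illustration only: it assumes the first row $(2, 1, 0, -1, -1, -1, 0, 1)$, computes determinants exactly with rational elimination, tests positive definiteness by Sylvester's criterion on leading principal minors, and evaluates the quadratic form on all vectors with entries in $\{-1, 0, 1\}$. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def circulant(first_row):
    """Circulant matrix whose (i, j) entry is first_row[(j - i) mod n]."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return result


def leading_minors(matrix):
    """Determinants of the leading principal submatrices."""
    return [det([row[:k] for row in matrix[:k]])
            for k in range(1, len(matrix) + 1)]


def quadratic_form(matrix, x):
    """Value of x' M x."""
    n = len(x)
    return sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))


M = circulant([2, 1, 0, -1, -1, -1, 0, 1])
```

Since every diagonal entry of `M` is 2 and `M` is symmetric, `quadratic_form(M, x)` is even for every integer vector `x`, which is the point used in the proof.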

That M is positive definite can be verified directly. It is no more difficult 
to characterize all positive definite symmetric unimodular 8 X 8 circulants. 
This is done in the following lemma. 


LEMMA 1. Any circulant $C$ whose first row is $(a_0, a_1, \dots, a_7)$ is unimodular,
symmetric, and positive definite if and only if

\[ a_0 = \tfrac12(1 + x), \quad a_1 = a_7 = \tfrac12 y, \quad a_2 = a_6 = 0, \]
\[ a_3 = a_5 = -\tfrac12 y, \quad a_4 = \tfrac12(1 - x), \]

where $x > 0$ and $x^2 - 2y^2 = 1$. (The circulant $M$ arises from $x = 3$, $y = 2$.)
Proof. Any circulant $C$ with first row $(a_0, a_1, \dots, a_7)$ has the eight
characteristic roots

\[ \alpha_\zeta = \sum_{i=0}^{7} a_i\zeta^i, \]

where $\zeta$ runs through the eight roots of $x^8 - 1 = 0$. The circulant $C$ is
unimodular and positive definite if the algebraic integers $\alpha_\zeta$ are real positive
units. From this it follows that $C$ is unimodular, symmetric, and positive
definite if and only if

(1) $a_0 + 2a_1 + 2a_2 + 2a_3 + a_4 = 1$ $(\zeta = 1)$,

(2) $a_0 - 2a_1 + 2a_2 - 2a_3 + a_4 = 1$ $(\zeta = -1)$,

(3) $a_0 - 2a_2 + a_4 = 1$ $(\zeta^2 = -1)$,

(4) $a_0 - a_4 + (a_1 - a_3)(\zeta - \zeta^3) = \epsilon_1$ $(\zeta^4 = -1)$,

(5) $a_0 - a_4 - (a_1 - a_3)(\zeta - \zeta^3) = \epsilon_2$ $(\zeta^4 = -1)$,

where $\epsilon_1$, $\epsilon_2$ are real and positive units.


The equations (1), (2), (3) imply that $a_2 = 0$, $a_0 + a_4 = 1$, $a_1 + a_3 = 0$.
Introducing these relations and $\zeta - \zeta^3 = \pm\sqrt2$ for $\zeta^4 = -1$ into (4) and
(5) we obtain

\[ 2a_0 - 1 + 2a_1\sqrt2 = \epsilon_1, \]
\[ 2a_0 - 1 - 2a_1\sqrt2 = \epsilon_2. \]

Hence

\[ (2a_0 - 1)^2 - 8a_1^2 = \epsilon_1\epsilon_2. \tag{6} \]

Since the left side of (6) is rational it follows that $\epsilon_1\epsilon_2 = 1$. Putting $2a_0 - 1 = x$
and $2a_1 = y$ the assertion follows. Since the general solution of $x^2 - 2y^2 = 1$
is given by

\[ x - \sqrt2\,y = (3 - 2\sqrt2)^p = (1 - \sqrt2)^{2p}, \]

we find that

\[ x - \sqrt2\,y \equiv 3^p - 2p\,3^{p-1}\sqrt2 \equiv (-1)^p - 2p(-1)^{p-1}\sqrt2 \pmod 4. \]

Thus $y$ is always even, and

\[ \frac{1 + x}{2} \equiv \frac{1 + (-1)^p}{2} \pmod 2, \]

i.e., $a_0$ is even when $p$ is odd and odd when $p$ is even. Thus the circulants
derived from a solution with an even $p$ are congruent to the identity, while
those derived from a solution with an odd $p$ are congruent to $K$.
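The parity statement can be illustrated by generating the solutions of $x^2 - 2y^2 = 1$ with $x > 0$, $y \ge 0$ from powers of $3 + 2\sqrt2$. This is a small sketch under that assumption; the names `pell_solution` and `first_row` are introduced here for illustration only.

```python
def pell_solution(p):
    """Return (x, y) with x + y*sqrt(2) = (3 + 2*sqrt(2))**p, for p >= 0."""
    x, y = 1, 0
    for _ in range(p):
        # (3 + 2*sqrt2)(x + y*sqrt2) = (3x + 4y) + (2x + 3y)*sqrt2
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return x, y


def first_row(p):
    """First row (a_0, ..., a_7) of the circulant attached to exponent p."""
    x, y = pell_solution(p)
    a0, a1, a4 = (1 + x) // 2, y // 2, (1 - x) // 2
    return [a0, a1, 0, -a1, a4, -a1, 0, a1]
```

For instance `first_row(1)` is `[2, 1, 0, -1, -1, -1, 0, 1]`, the first row of $M$, and `first_row(0)` is the first row of the identity.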


As the referee pointed out, the two classes of circulants can also be obtained
from the fact that every positive definite unimodular $8 \times 8$ circulant $C$ is a
power of $M$. For, every power $M^n$ is certainly such a circulant. Conversely,
the proof of Lemma 1 shows that there is exactly one such circulant whose
characteristic roots are given powers of the characteristic roots of $M$.

If then $n$ is even, we have $M^n = M^{n/2} \cdot M^{n/2} \sim I$ and for $n$ odd we have

\[ M^n = M^{(n-1)/2}\, M\, M^{(n-1)/2} \sim M. \]


REFERENCES 


1. H. Minkowski, Grundlagen für eine Theorie der quadratischen Formen mit ganzzahligen
Koeffizienten, Gesammelte Abhandlungen 1 (1911), 3-144.

2. L. J. Mordell, The definite quadratic forms in 8 variables with determinant unity, J. de Math.
pures et appliquées, 17 (1938), 41-46.

3. M. Newman and O. Taussky, On a generalisation of the normal basis in abelian algebraic
number fields, Comm. on Pure and Applied Math. 9 (1956), 85-91.

4. O. Taussky, Unimodular integral circulants, Math. Zeitschr. 63 (1955), 286-289.


National Bureau of Standards,
Washington 25, D.C. 








SIMULTANEOUS PAIRS OF LINEAR AND QUADRATIC 
EQUATIONS IN A GALOIS FIELD 


ECKFORD COHEN 


1. Introduction. Let $F$ denote the Galois field $GF(p^r)$ with $p^r$ elements,
where $p$ is an odd prime and $r$ is a positive integer. Suppose further that $m$
and $n$ are arbitrary elements of $F$ and that $\alpha_i$, $\beta_i$ $(i = 1, \dots, s)$ are nonzero
elements of $F$. The purpose of this paper is to evaluate the function $N_s(m, n)$,
defined, for an arbitrary positive integer $s$, to be the number of simultaneous
solutions in $F$ of the equations

\[ \begin{cases} m = \alpha_1x_1^2 + \dots + \alpha_sx_s^2, \\ n = \beta_1x_1 + \dots + \beta_sx_s. \end{cases} \tag{1.1} \]

Explicit formulas for $N_s(m, n)$ are obtained in Theorem 1, and on the basis
of this theorem, it is easy to establish the solvability criterion contained in
Theorem 2. It follows from the latter criterion that the least value of $s$ for which
(1.1) is always solvable is the value $s = 4$. We mention that Theorem 1,
in the special case $r = 1$ (that is, in the case of rational congruences (mod $p$)),
reduces to a result of O'Connor and Pall (3; 4) proved by a different method.

It is of interest to compare Dickson's formulas (2, §§64-67) for the number
of solutions $N_s(m)$ of the first equation in (1.1) alone, with the results for
$N_s(m, n)$ obtained in this paper. As might be expected, the results for the
simultaneous problem are somewhat more involved. A significant difference
between the results for the two problems arises from the fact that $N_s(m) > 0$
for all $s \ge 2$.

In this paper we use a direct method based on the trigonometric expansion
of $N_s(m, n)$. The most that will be required is a double application of the
generalized Cauchy-Gauss sum, (1.7) and (1.11) below.

Next we introduce some notation that will be needed in §2 and §3. Let
$t(a)$ denote the trace of an element $a$ in $F$,

\[ t(a) = a + a^p + \dots + a^{p^{r-1}}. \]

Then we place

\[ e(a) = e^{2\pi i\,t(a)/p}, \tag{1.2} \]

from which it follows that $e(a + b) = e(a)\,e(b)$. The symbol $\sum_x$ will be used to
indicate a sum over the totality of elements of $F$, while $\sum_{x \ne 0}$ will denote a
sum over the nonzero elements of $F$. One will note the property,

\[ \epsilon(a) = \sum_x e(ax) = \begin{cases} p^r, & a = 0, \\ 0, & a \ne 0, \end{cases} \tag{1.3} \]


Received June 4, 1956. 


which may be restated in the form,

\[ \epsilon^*(a) = \sum_{x \ne 0} e(ax) = \begin{cases} p^r - 1, & a = 0, \\ -1, & a \ne 0. \end{cases} \tag{1.4} \]

The symbol $\psi(a)$ will be used to denote the Legendre symbol in $F$, that is,
$\psi(a) = 1$, $-1$, or $0$ according as $a$ is a nonzero square, a non-square, or is
zero in $F$. We denote the quadratic Gauss sums in $F$ by

\[ G(a) = \sum_x e(ax^2), \tag{1.5} \]

\[ G^*(a) = \sum_{x \ne 0} \psi(x)\,e(ax). \tag{1.6} \]

The less familiar Cauchy-Gauss sum is defined for $F$ by

\[ S(a, b) = \sum_x e(ax^2 + 2bx). \tag{1.7} \]

We mention the following well-known properties of $G(a)$ and $G^*(a)$:

\[ G(a) = \psi(a)\,G(1), \qquad a \ne 0, \tag{1.8} \]

\[ G^2(1) = \psi(-1)\,p^r, \tag{1.9} \]

\[ G^*(a) = \begin{cases} G(a), & a \ne 0, \\ 0, & a = 0. \end{cases} \tag{1.10} \]

The sum $S(a, b)$ has the reduction property (1, §6),

\[ S(a, b) = \begin{cases} e(-b^2/a)\,G(a), & a \ne 0, \\ p^r, & a = b = 0, \\ 0, & a = 0,\ b \ne 0. \end{cases} \tag{1.11} \]
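Properties (1.8), (1.9) and (1.11) can be checked numerically in the prime-field case. The sketch below is restricted to $r = 1$ (so that the trace is the identity and $F$ is the integers modulo $p$), an assumption made here to keep the arithmetic elementary; floating-point sums are compared up to a small tolerance, and all function names are illustrative.

```python
import cmath


def e(a, p):
    """Additive character e(a) = exp(2*pi*i*a/p) of the prime field GF(p)."""
    return cmath.exp(2j * cmath.pi * (a % p) / p)


def legendre(a, p):
    """Legendre symbol psi(a) in GF(p): 1, -1 or 0."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def gauss_sum(a, p):
    """G(a) = sum over x of e(a x^2), as in (1.5)."""
    return sum(e(a * x * x, p) for x in range(p))


def cauchy_gauss_sum(a, b, p):
    """S(a, b) = sum over x of e(a x^2 + 2 b x), as in (1.7)."""
    return sum(e(a * x * x + 2 * b * x, p) for x in range(p))


def reduced_sum(a, b, p):
    """Right-hand side of the reduction property (1.11)."""
    a %= p
    b %= p
    if a != 0:
        return e(-b * b * pow(a, -1, p), p) * gauss_sum(a, p)
    return p if b == 0 else 0


def close(u, v, tol=1e-9):
    """True if two complex numbers agree up to tol."""
    return tol > abs(u - v)
```

The case $a \ne 0$ of (1.11) is just completing the square, $ax^2 + 2bx = a(x + b/a)^2 - b^2/a$, followed by a shift of the summation variable.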


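2. The evaluation of $N_s(m, n)$. We shall need the following additional
notation,

\[ \alpha = \alpha_1 \dots \alpha_s, \tag{2.1} \]

\[ \beta = \frac{\beta_1^2}{\alpha_1} + \dots + \frac{\beta_s^2}{\alpha_s}, \tag{2.2} \]

\[ \gamma = n^2 - \beta m. \tag{2.3} \]

The results of this section can be stated most conveniently in terms of the
five following cases arising from conditions satisfied by $m$, $n$, $\beta$, and $\gamma$:

Case I: $\beta = 0$, $n \ne 0$,
Case II: $\beta = n = 0$, $m \ne 0$,
Case III: $\beta = m = n = 0$,
Case IV: $\beta \ne 0$, $\gamma \ne 0$,
Case V: $\beta \ne 0$, $\gamma = 0$.

We now prove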


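THEOREM 1. The number of solutions $N_s(m, n)$ of (1.1) is given by

\[ N_s(m, n) = \begin{cases} p^{r(s-2)} + p^{r(s-2)/2}\,\psi(\alpha)\,\zeta, & s = 4k, \\ p^{r(s-2)} + p^{r(s-3)/2}\,\psi(\alpha)\,\eta, & s = 4k + 1, \\ p^{r(s-2)} + p^{r(s-2)/2}\,\psi(-\alpha)\,\zeta, & s = 4k + 2, \\ p^{r(s-2)} + p^{r(s-3)/2}\,\psi(-\alpha)\,\eta, & s = 4k + 3, \end{cases} \tag{2.4} \]

where $\eta$ and $\zeta$ are defined by $\eta = 0$, $\zeta = 0$ in Case I; $\eta = p^r\psi(m)$, $\zeta = -1$
in Case II; $\eta = 0$, $\zeta = p^r - 1$ in Case III; $\eta = -\psi(\beta)$, $\zeta = \psi(\gamma)$ in Case IV;
$\eta = (p^r - 1)\,\psi(\beta)$, $\zeta = 0$ in Case V.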


Remark. It is to be understood that $N_s(m, n)$ is undefined for any cases
that may be incompatible.


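Proof. The function $N_s(m, n)$ has the double Fourier expansion (5),

\[ N_s(m, n) = p^{-2r} \sum_{u,v} A(u, v)\,e(-mu)\,e(-2nv), \tag{2.5} \]

\[ A(u, v) = \sum_{x_1, \dots, x_s} e\bigl(u(\alpha_1x_1^2 + \dots + \alpha_sx_s^2)\bigr)\, e\bigl(2v(\beta_1x_1 + \dots + \beta_sx_s)\bigr). \]

We break up this expansion into two parts according as $u = 0$ or $u \ne 0$,
to get

\[ N_s(m, n) = \Sigma_1 + \Sigma_2, \tag{2.6} \]

where

\[ \Sigma_1 = p^{-2r} \sum_v e(-2nv) \prod_{i=1}^{s} \epsilon(2\beta_iv), \tag{2.7} \]

\[ \Sigma_2 = p^{-2r} \sum_{u \ne 0} \sum_v e(-mu)\,e(-2nv) \prod_{i=1}^{s} S(\alpha_iu, \beta_iv). \tag{2.8} \]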

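By (1.3) we have immediately

\[ \Sigma_1 = p^{r(s-2)}. \tag{2.9} \]

Now by (1.8) and (1.11) one obtains for $u \ne 0$,

\[ S(\alpha_iu, \beta_iv) = e\!\left(\frac{-\beta_i^2v^2}{\alpha_iu}\right) \psi(\alpha_iu)\,G(1), \]

so that (2.8) becomes, using the definition of $\beta$,

\[ \Sigma_2 = G^s(1)\,p^{-2r}\,\psi(\alpha) \sum_{u \ne 0} \psi^s(u)\,e(-mu)\,S(-\beta/u, -n). \tag{2.10} \]

If $u \ne 0$, we have, again by (1.8) and (1.11),

\[ S(-\beta/u, -n) = \begin{cases} e(n^2u/\beta)\,\psi(-\beta u)\,G(1), & \beta \ne 0, \\ p^r, & \beta = n = 0, \\ 0, & \beta = 0,\ n \ne 0. \end{cases} \tag{2.11} \]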


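We now evaluate $\Sigma_2$ in the separate cases arising from (2.11). It follows
immediately from (2.10) that

\[ \Sigma_2 = 0, \qquad \beta = 0,\ n \ne 0. \tag{2.12} \]

In case $\beta = n = 0$, we obtain from (2.10) and (2.11),

\[ \Sigma_2 = \begin{cases} G^s(1)\,p^{-r}\,\psi(\alpha)\,\epsilon^*(-m), & \beta = n = 0,\ s \text{ even}, \\ G^s(1)\,p^{-r}\,\psi(\alpha)\,G^*(-m), & \beta = n = 0,\ s \text{ odd}. \end{cases} \tag{2.13} \]

Applying (1.4), (1.9), and (1.10) to (2.13), it follows, in case $s$ is even, that

\[ \Sigma_2 = \begin{cases} -\psi\bigl((-1)^{s/2}\alpha\bigr)\,p^{r(s-2)/2}, & \beta = n = 0,\ m \ne 0,\ s \text{ even}, \\ \psi\bigl((-1)^{s/2}\alpha\bigr)\,p^{r(s-2)/2}\,(p^r - 1), & \beta = m = n = 0,\ s \text{ even}, \end{cases} \tag{2.14} \]

and in case $s$ is odd,

\[ \Sigma_2 = \begin{cases} \psi\bigl((-1)^{(s-1)/2}\alpha m\bigr)\,p^{r(s-1)/2}, & \beta = n = 0,\ m \ne 0,\ s \text{ odd}, \\ 0, & \beta = m = n = 0,\ s \text{ odd}. \end{cases} \tag{2.15} \]

In case $\beta \ne 0$, it follows from (2.10) and (2.11) that

\[ \Sigma_2 = \begin{cases} G^{s+1}(1)\,p^{-2r}\,\psi(-\alpha\beta)\,G^*(\gamma/\beta), & \beta \ne 0,\ s \text{ even}, \\ G^{s+1}(1)\,p^{-2r}\,\psi(-\alpha\beta)\,\epsilon^*(\gamma/\beta), & \beta \ne 0,\ s \text{ odd}. \end{cases} \tag{2.16} \]

Applying (1.4), (1.9), and (1.10) to (2.16), we obtain, in case $s$ is even,

\[ \Sigma_2 = \begin{cases} \psi\bigl((-1)^{s/2}\alpha\gamma\bigr)\,p^{r(s-2)/2}, & \beta \ne 0,\ \gamma \ne 0,\ s \text{ even}, \\ 0, & \beta \ne 0,\ \gamma = 0,\ s \text{ even}, \end{cases} \tag{2.17} \]

and in case $s$ is odd,

\[ \Sigma_2 = \begin{cases} -\psi\bigl((-1)^{(s-1)/2}\alpha\beta\bigr)\,p^{r(s-3)/2}, & \beta \ne 0,\ \gamma \ne 0,\ s \text{ odd}, \\ \psi\bigl((-1)^{(s-1)/2}\alpha\beta\bigr)\,p^{r(s-3)/2}\,(p^r - 1), & \beta \ne 0,\ \gamma = 0,\ s \text{ odd}. \end{cases} \tag{2.18} \]

Combining (2.6), (2.9), (2.12), (2.14), (2.15), (2.17), and (2.18) the theorem
follows.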


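3. Solvability criterion. We now apply Theorem 1 to the cases $s \le 4$
to obtain the following explicit results.

\[ N_1(m, n) = \begin{cases} 1, & \text{Case V}, \\ 0, & \text{Case IV}; \end{cases} \tag{3.1} \]

\[ N_2(m, n) = \begin{cases} 1, & \text{Cases I, V}, \\ 0, & \text{Case II}, \\ p^r, & \text{Case III}, \\ 1 + \psi(-\alpha\gamma), & \text{Case IV}; \end{cases} \tag{3.2} \]

\[ N_3(m, n) = \begin{cases} p^r, & \text{Cases I, III}, \\ p^r + p^r\psi(-\alpha m), & \text{Case II}, \\ p^r - \psi(-\alpha\beta), & \text{Case IV}, \\ p^r + (p^r - 1)\,\psi(-\alpha\beta), & \text{Case V}; \end{cases} \tag{3.3} \]

\[ N_4(m, n) = \begin{cases} p^{2r}, & \text{Cases I, V}, \\ p^{2r} - p^r\psi(\alpha), & \text{Case II}, \\ p^{2r} + p^r(p^r - 1)\,\psi(\alpha), & \text{Case III}, \\ p^{2r} + p^r\psi(\alpha\gamma), & \text{Case IV}. \end{cases} \tag{3.4} \]

It is noted that Cases I, II, and III do not arise if $s = 1$ or if $s = 2$ and
$\psi(-\alpha) = -1$.
On the basis of (3.1), (3.2), (3.3), and (3.4) we obtain immediately the
following solvability criterion.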


THEOREM 2. Subject to the restrictions stated in the Introduction, (1.1) is
always solvable ($N_s(m, n) > 0$) provided $s \ge 4$. The only cases in which (1.1)
is insolvable, that is when $N_s(m, n) = 0$, are the following:

(1) $s = 1$, $\gamma \ne 0$,

(2) $s = 2$, $\beta \ne 0$, $\gamma \ne 0$, $\psi(-\alpha\gamma) = -1$,

(3) $s = 2$, $\beta = n = 0$, $m \ne 0$,

(4) $s = 3$, $\beta = n = 0$, $m \ne 0$, $\psi(-\alpha m) = -1$,

where $\alpha$, $\beta$, and $\gamma$ are defined as in §2.
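The counting formula and the solvability criterion can be compared with a brute-force count in the prime-field case. The sketch below assumes $r = 1$, so that $F$ is the integers modulo an odd prime $p$; it uses the case values of $\eta$ and $\zeta$ from Theorem 1 of §2 with $\gamma = n^2 - \beta m$. The function names and the sample coefficient sets are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def legendre(a, p):
    """Legendre symbol in GF(p): 1, -1 or 0."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def brute_count(alphas, betas, m, n, p):
    """Count x in GF(p)^s with sum a_i x_i^2 = m and sum b_i x_i = n."""
    count = 0
    for x in product(range(p), repeat=len(alphas)):
        quad = sum(a * v * v for a, v in zip(alphas, x)) % p
        lin = sum(b * v for b, v in zip(betas, x)) % p
        if quad == m % p and lin == n % p:
            count += 1
    return count


def formula_count(alphas, betas, m, n, p):
    """Evaluate the closed formula of Theorem 1 for r = 1."""
    s = len(alphas)
    m %= p
    n %= p
    alpha = 1
    for a in alphas:
        alpha = alpha * a % p
    beta = sum(b * b * pow(a, -1, p) for a, b in zip(alphas, betas)) % p
    gamma = (n * n - beta * m) % p
    if beta == 0 and n != 0:                      # Case I
        eta, zeta = 0, 0
    elif beta == 0 and m != 0:                    # Case II
        eta, zeta = p * legendre(m, p), -1
    elif beta == 0:                               # Case III
        eta, zeta = 0, p - 1
    elif gamma != 0:                              # Case IV
        eta, zeta = -legendre(beta, p), legendre(gamma, p)
    else:                                         # Case V
        eta, zeta = (p - 1) * legendre(beta, p), 0
    sign = 1 if s % 4 in (0, 1) else -1           # psi(alpha) or psi(-alpha)
    psi = legendre(sign * alpha, p)
    q = Fraction(p)                               # allows negative exponents
    if s % 2 == 0:
        value = q ** (s - 2) + q ** ((s - 2) // 2) * psi * zeta
    else:
        value = q ** (s - 2) + q ** ((s - 3) // 2) * psi * eta
    assert value.denominator == 1
    return int(value)
```

With `p = 5`, `alphas = (1, 1)` and `betas = (1, 2)` one has $\beta = 0$, so Cases I to III occur; for $m \ne 0$, $n = 0$ both counts are 0, matching item (3) of Theorem 2.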


REFERENCES 

1. Leonard Carlitz, Weighted quadratic partitions over a finite field, Can. J. Math., 5 (1953),
317-323.

2. L. E. Dickson, Linear Groups (Leipzig, 1901).

3. R. E. O'Connor, Quadratic and linear congruence, Bull. Amer. Math. Soc., 45 (1939),
792-798.

4. R. E. O'Connor and Gordon Pall, The quaternion congruence $\bar{t}at \equiv b \pmod g$, Amer. J.
Math., 61 (1939), 487-508.

5. A. L. Whiteman, Finite Fourier series and equations in finite fields, Trans. Amer. Math.
Soc., 74 (1953), 78-98.


University of Tennessee 




THE SET OF ALL GENERALIZED LIMITS 
OF BOUNDED SEQUENCES 


MEYER JERISON 


1. Introduction. Let $M$ be the normed linear space whose general
element, $x$, is a bounded sequence

\[ \{\xi_n\}_{n=1}^{\infty} \]

of real numbers, and $\|x\| = \mathrm{l.u.b.}\ |\xi_n|$. Let $T$ denote the linear operation (of
norm 1) defined by $Tx = (\xi_2, \xi_3, \dots, \xi_{n+1}, \dots)$. A generalized limit is a linear
functional $\phi$ on $M$ which satisfies the conditions

(1) $x \ge 0$ (i.e., $\xi_n \ge 0$ for all $n$) implies $\phi(x) \ge 0$;
(2) $\phi(Tx) = \phi(x)$ for all $x \in M$;
(3) $\phi(1, 1, 1, \dots) = 1$.

The set of all generalized limits will be denoted by $L$. In the presence of (1),
condition (3) is equivalent to $\|\phi\| = 1$.

The basic question of existence of generalized limits has been settled in a
variety of ways; the standard proof appears in (2, p. 34). This proof, based
upon the Hahn-Banach theorem, actually leads to all generalized limits, and
this fact was used in (7) to obtain properties of $L$. In the present paper,
attention is focused on another existence proof (5, p. 1010; 8; 10, p. 52)
which depends, ultimately, on Tychonoff's theorem. In order to describe
this proof, we must summarize some well-known properties of $M$ and of the
conjugate space $M^*$ (1; 5).

Only one topology in $M^*$ will be of interest to us, namely, the weak*
topology, which is defined as follows: A directed system $\{\phi_\nu\}$ in $M^*$ converges
to $\phi$ if $\phi_\nu(x) \to \phi(x)$ for each $x \in M$. An essential property of this topology
is that the set

$$B^* = \{\phi : |\phi(x)| \leq \|x\| \text{ for all } x \in M\}$$

(the unit ball) is compact. $B^*$ is also convex, and we are able to apply the
Krein-Milman theorem to its subsets. Thus, if $K$ is a closed convex subset of
$B^*$, and $S$ ($\subset K$) contains all extreme points of $K$, then $K$ is the closed convex
hull of $S$, denoted by $\mathfrak{H}(S)$. In particular, if $K$ is not empty, neither is $S$.

We will denote by $\Omega'$ the set of extreme points of $B^*$ that satisfy condition
(1), or equivalently, the collection of extreme points of the subset of $B^*$
which is determined by conditions (1) and (3). Since the latter set is closed and
convex, it is, in fact, $\mathfrak{H}(\Omega')$. Among the functionals in $\Omega'$ are those of the form
$\theta_p(x) = \xi_p$ for each fixed natural number $p$. The collection $N$ of all such
functionals is a discrete, open, and dense subset of $\Omega'$. Since $\Omega'$ happens to
be compact (1, p. 504), it is the closure of $N$ in $M^*$, and it is known as the
Stone-Čech compactification of $N$. The set $\Omega$, the complement of $N$ in $\Omega'$,
is characterized, as a subset of $\Omega'$, by the fact that each of its elements satisfies
the condition

(4) $\phi(x)$ is independent of the value of $\xi_n$, for each fixed $n$.

Its closed convex hull, $\mathfrak{H}(\Omega)$, is the set of all functionals in $M^*$ satisfying
conditions (1), (3), and (4). Since (2) obviously implies (4), $L \subset \mathfrak{H}(\Omega)$.
It is important to note that $L$ is itself a closed convex set.

Received October 26, 1953; in revised form October 21, 1955. This work was supported,
in part, by a grant from the National Science Foundation (USA).

The proof of existence of generalized limits that was referred to in the second
paragraph goes as follows: Let $\chi \in \Omega$, or, more generally, $\chi \in \mathfrak{H}(\Omega)$; a generalized
limit, $\psi$, is obtained by setting

$$\psi(x) = \chi\Big(\Big\{ n^{-1} \sum_{i=1}^{n} \xi_i \Big\}\Big)$$

for $x = \{\xi_n\}$ in $M$. Now, a functional $\chi \in \mathfrak{H}(\Omega)$ has the property that if
$y \in M$ is a convergent sequence, then $\chi(y)$ is the ordinary limit of $y$. If $x$
is a sequence whose arithmetic means, $n^{-1} \sum_{i=1}^{n} \xi_i$, converge to $\sigma$, say, then
$\psi(x) = \sigma$ for every generalized limit $\psi$ which is obtained in this manner.
Since it is known that there are some such sequences and some generalized
limits which assign to them values different from $\sigma$ (see Theorem 4), it follows
that this procedure does not lead to all generalized limits. It is our intention,
therefore, to modify this procedure in such a way as to obtain all generalized
limits.
Let $T_n$ be the operator

$$T_n = n^{-1} \sum_{i=0}^{n-1} T^i,$$

and let $\theta_1(x) = \xi_1$ for all $x \in M$. Then $\{\theta_1(T_n x)\}$ is the sequence of arithmetic
means of $x$, and the generalized limit obtained above may be defined as
$\psi(x) = \chi(\{\theta_1(T_n x)\})$. From this point of view, an obvious way to generate
more generalized limits is to replace $\theta_1$ by some other functional. Three
observations should be made in this connection.

(a) If $\theta_1$ is replaced by $\theta_p \in N$, nothing new is obtained, because

$$\theta_p(T^i x) = \theta_1(T^{p+i-1} x), \qquad i = 1, 2, \ldots,$$

so that

$$\chi(\{\theta_p(T_n x)\}) = \chi(\{\theta_1(T_n T^{p-1} x)\}) = \psi(T^{p-1} x) = \psi(x).$$

(b) If $\phi$ is any functional satisfying conditions (1) and (3), i.e., $\phi \in \mathfrak{H}(\Omega')$,
and if $\chi \in \mathfrak{H}(\Omega)$, then

(5) $\psi(x) = \chi(\{\phi(T_n x)\})$

does define a generalized limit (Theorem 1).




(c) It is trivial that any generalized limit $\psi$ may be obtained from (5)
simply by taking $\phi = \psi$, because $\psi(T_n x) = \psi(x)$ for all $n$.

The purpose of this paper is to prove that the collection $Q$ of all functionals
of the form (5) with $\chi$ and $\phi$ in $\Omega$ is sufficient to yield all generalized limits
in the sense that $\mathfrak{H}(Q) = L$.

The proof requires a surprising amount of heavy machinery: the representation
of $M$ as the space of all continuous functions on $\Omega'$, the representation
of an arbitrary continuous linear functional on $M$ as a measure on $\Omega'$, and one
of the deep theorems of measure theory, the individual ergodic theorem.
A similar (but apparently weaker) result can be obtained by using entirely
different techniques. As was stated before, the set $N$ is dense in $\Omega'$, which means
that the element $\chi \in \Omega$ is the limit of a directed system $\{\theta_{n_\nu}\}$
of elements of $N$. In terms of this directed system, (5) becomes

$$\psi(x) = \lim_\nu \phi(T_{n_\nu} x).$$

Since $\chi \in \Omega$, $n_\nu \to \infty$. Let us consider, now, the set $A$ of all continuous linear
functionals of the form

(6) $\psi(x) = \lim_\nu \phi_\nu(T_{n_\nu} x)$

where $n_\nu \to \infty$, $\phi_\nu \in \Omega$ for all $\nu$, and the limit is assumed to exist for all $x \in M$.
It will be seen that $A$ is a closed subset of $L$, and the proposition $\mathfrak{H}(A) = L$
will be proved independently of the obvious fact that $Q \subset A$. I have not
been able to determine¹ whether $Q$ is a proper subset of $A$ or whether the
closure of $Q$ is all of $A$.


2. Limit points of sequences in M*. Since it is convenient to work in
the space $M^*$ as much as possible, we introduce the operator $T^*$ on $M^*$,
defined by $T^*\phi(x) = \phi(Tx)$ for all $x \in M$, $\phi \in M^*$. Condition (2) becomes
(2*) $T^*\phi = \phi$. In keeping with the notation used above, we have

$$T_n^* = n^{-1} \sum_{i=0}^{n-1} T^{*i}.$$

THEOREM 1. If

$$\psi = \lim_\nu T_{n_\nu}^* \phi_\nu,$$

where each $\phi_\nu$ is a positive linear functional of norm 1 (i.e., $\phi_\nu$ satisfies conditions
(1) and (3)) and $n_\nu \to \infty$, then $\psi \in L$.

Proof. It is clear that $\psi$ satisfies conditions (1) and (3). For any $x \in M$,

$$\psi(x - Tx) = \lim_\nu \big[T_{n_\nu}^* \phi_\nu(x) - T_{n_\nu}^* \phi_\nu(Tx)\big]
= \lim_\nu \phi_\nu\Big[n_\nu^{-1} \sum_{i=0}^{n_\nu - 1} (T^i x - T^{i+1} x)\Big]
= \lim_\nu \phi_\nu\big[n_\nu^{-1}(x - T^{n_\nu} x)\big].$$

But

$$\big|\phi_\nu\big[n_\nu^{-1}(x - T^{n_\nu} x)\big]\big| \leq 2\,\|\phi_\nu\| \cdot n_\nu^{-1}\,\|x\| \to 0$$

with $\nu$. Therefore, $\psi(x - Tx) = 0$.

¹I am grateful to the referee for many suggestions for the improvement of an earlier version
of this paper. Most important, by far, was his discovery of an error in what purported to be a
proof that $Q = A$.


The existence of generalized limits is an immediate consequence of Theorem
1; this is substantially the same proof that was described in the Introduction.
Take any $\theta \in N$. Since $\{T_n^*\theta\} \subset B^*$ and the latter set is compact, the former
set must have limit points all of which are in $L$, according to Theorem 1.

We observe that $\{T_n^*\theta\}$ is not a convergent sequence in $M^*$ for any $\theta \in N$,
because convergence of $\{T_n^*\theta\}$ in $M^*$ implies convergence of the sequence
of numbers $\{T_n^*\theta(x)\}$ for every $x \in M$. But, if $\theta = \theta_p$, then

$$T_n^*\theta_p(x) = \theta_p(T_n x)$$

is the $n$th arithmetic mean of the sequence $T^{p-1}x$, and it is easy to find an $x$
which will make $\{\theta_p(T_n x)\}$ diverge. What is not so obvious is that if $\omega \in \Omega$,
$\{T_n^*\omega\}$ need not converge either. Here is an example.


First, construct a sequence $\{\eta_i\}$ whose arithmetic means do not converge.
Thus: $\eta_1 = 1$, $\eta_i = 0$ if $2^k \leq i < 2^k + 2^{k-1}$, and $\eta_i = 1$ if

$$2^k + 2^{k-1} \leq i < 2^{k+1}, \qquad k = 1, 2, \ldots.$$
Next, let $\{\theta_{n_k}\}$ ($k = 1, 2, \ldots$)
be a sequence in $N$ such that $n_k - n_{k-1} > 2^k$, and let $\omega$ be one of its limit
points. Define $\xi_n = 0$ if $n = n_k$ for some $k$ or $n < n_1$, and $\xi_n = \eta_i$ if $i$ is the
least positive integer such that $n = n_k + i$ for some $k$. The inequality serves
to guarantee that, for each $i$,

$$\xi_{n_k + i} = \eta_i$$

for all but a finite number of $k$'s. Setting $x = \{\xi_n\}$, we have

$$\omega(x) = \lim_{k \to \infty} \xi_{n_k}$$

provided the limit exists, and more generally,

$$\omega(T^i x) = \lim_{k \to \infty} \xi_{n_k + i}.$$

It follows that for each $i$, $\omega(T^i x) = \eta_i$, and therefore,

$$\Big\{ n^{-1} \sum_{i=0}^{n-1} \omega(T^i x) \Big\}$$

does not converge. But this implies that $\{T_n^*\omega\}$ does not converge in $M^*$.
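The divergence of the arithmetic means that drives this example can be checked numerically. The sketch below is only an illustration: it assumes one concrete 0-1 sequence, written here as the hypothetical function `eta`, which is 0 on the first half and 1 on the second half of each block $[2^k, 2^{k+1})$. The means then come close to 1/3 at the end of each run of zeros and exceed 1/2 at the end of each run of ones.

```python
from fractions import Fraction


def eta(i: int) -> int:
    """Assumed 0-1 sequence: eta(1) = 1; on each block [2^k, 2^(k+1)),
    the first half is 0 and the second half is 1."""
    if i == 1:
        return 1
    k = i.bit_length() - 1  # the unique k with 2^k <= i < 2^(k+1)
    return 0 if i < 2**k + 2 ** (k - 1) else 1


def arithmetic_mean(n: int) -> Fraction:
    """Mean of eta(1), ..., eta(n) as an exact fraction."""
    return Fraction(sum(eta(i) for i in range(1, n + 1)), n)


# Means sampled at the end of each run of zeros and at the end of each run of ones.
low = [arithmetic_mean(2**k + 2 ** (k - 1) - 1) for k in range(4, 11)]
high = [arithmetic_mean(2 ** (k + 1) - 1) for k in range(4, 11)]

print([float(m) for m in low])
print([float(m) for m in high])
```

The two sampled subsequences stay on opposite sides of a fixed gap, so the full sequence of means has no limit.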




The sequence $\{n_k\}$ is an example of a sequence of integers of density 0
in the number theoretic sense. Making use of the obvious one to one correspondence
between $N$ and the set of positive integers, we may speak, in the
same way, of non-dense subsets of $N$. It is somewhat surprising, in view of the
fact that $N$ is a countable, dense, discrete set in $\Omega'$, that there exist points $\omega$
in $\Omega$ which are not limit points of any non-dense subset of $N$. I conjecture
that for some such $\omega$ the sequence $\{T_n^*\omega\}$ does converge, but I have not been
able to prove this. It will be seen in the proof of Theorem 3 that for each
$x \in M$ there exists $\omega \in \Omega$ with the property that $\{T_n^*\omega(x)\}$ is a convergent
sequence of numbers.


3. Sets that generate L. We have already observed that $L \subset \mathfrak{H}(\Omega)$.
Since $T_n^*$ is a continuous mapping and $\Omega$ is compact, the set $A_n = T_n^*(\Omega)$ is
compact. From the additional fact that $T_n^*$ is linear, it follows that for any
set $S \subset B^*$,

$$T_n^*(\mathfrak{H}(S)) = \mathfrak{H}(T_n^*(S)).$$

Since $L$ is elementwise invariant under $T_n^*$, we have

$$L = T_n^*(L) \subset T_n^*(\mathfrak{H}(\Omega)) = \mathfrak{H}(T_n^*(\Omega)) = \mathfrak{H}(A_n).$$

It follows that

$$L \subset \bigcap_{n=1}^{\infty} \mathfrak{H}(A_n).$$

This conclusion is obtained so easily because the closed convex hull of a set is,
in general, very much bigger than the set itself. We would like to have $L$ as
the closed convex hull of an intersection rather than the intersection of hulls.
It is hopeless, however, to expect that

$$L = \mathfrak{H}\Big(\bigcap A_n\Big),$$

because $\bigcap A_n$ is empty in the worst possible way; namely, the sets $A_n$ are
mutually disjoint. To prove the last statement, fix $n$, and let $x = \{\xi_i\}$ where
$\xi_i = 1$ if $i$ is a multiple of $n$ and $\xi_i = 0$ otherwise. Then for every $\omega \in \Omega'$,
$\omega(x)$ is either 0 or 1, there is some integer $i'$ such that $T^{*i'}\omega(x) = 1$, and
$T^{*i}\omega(x) = 1$ if $i \equiv i' \pmod n$ and $T^{*i}\omega(x) = 0$ if $i \not\equiv i' \pmod n$. Consequently,
$\phi(x) = 1/n$ for all $\phi \in A_n$, whereas, if $\phi \in A_k$ with $k < n$, then
$\phi(x) = 0$ or $1/k$. Thus, $A_k \cap A_n$ is empty for all $k < n$ and for all $n$.
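The counting behind the disjointness argument can be sanity-checked on the dense subset $N$, where everything is finite. The sketch below is an illustration, not part of the proof: it evaluates $\theta_p(T_k x)$ for the indicator sequence of the multiples of $n$, using the hypothetical helper names `xi`, `mean_at` and `values`, and a finite range of $p$ as a stand-in for all of $N$. The statement in the text concerns points of $\Omega$, which are limits of such $\theta_p$.

```python
from fractions import Fraction


def xi(i: int, n: int) -> int:
    """Indicator of the multiples of n (terms indexed from 1)."""
    return 1 if i % n == 0 else 0


def mean_at(p: int, k: int, n: int) -> Fraction:
    """theta_p(T_k x) = (xi_p + ... + xi_{p+k-1}) / k for the indicator sequence x."""
    return Fraction(sum(xi(p + i, n) for i in range(k)), k)


def values(k: int, n: int, p_max: int = 200) -> set:
    """Set of values taken by theta_p(T_k x) for p = 1, ..., p_max."""
    return {mean_at(p, k, n) for p in range(1, p_max + 1)}


n = 6
print(values(n, n))                      # a window of length n holds exactly one multiple of n
for k in range(1, n):
    print(k, sorted(values(k, n)))       # shorter windows hold at most one multiple of n
```

For window length $k = n$ the only value is $1/n$, while for $k < n$ the only values are 0 and $1/k$, which matches the two cases distinguished in the text.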


The operation on the sequence of sets $\{A_n\}$ that does the job we want is
the topological limit superior. By definition,

$$\psi \in \limsup A_n$$

if every neighborhood of $\psi$ meets infinitely many of the sets $A_n$. Equivalently,

$$\limsup A_n = \bigcap_{m=1}^{\infty} F_m,$$

where $F_m$ is the closure of the set

$$\bigcup_{n=m}^{\infty} A_n.$$

It is clear from the form of the sets $A_n$ that if $\psi \in \limsup A_n$, then there are
directed systems

$$\{T_{n_\nu}^*\} \text{ and } \{\omega_\nu\},$$

directed by the same set $\{\nu\}$ and with $\omega_\nu \in \Omega$, such that $n_\nu \to \infty$ and

$$T_{n_\nu}^*\omega_\nu \to \psi.$$

Conversely, every cluster point of such a directed system is in $\limsup A_n$.
Thus, $\limsup A_n$ is precisely the set $A$ of the introduction. By Theorem 1,
$A \subset L$, and since $L$ is compact and convex, we conclude that $\mathfrak{H}(A) \subset L$.


THEOREM 2. $\mathfrak{H}(A) = L$.

Proof. In view of the preceding discussion, we have only to prove that
$L \subset \mathfrak{H}(A)$. But this follows easily from Theorem 2 of (4), which asserts that
for a sequence $\{\mathfrak{H}(A_n)\}$ of compact convex sets,

$$\limsup \mathfrak{H}(A_n) \subset \mathfrak{H}(\limsup A_n).$$

We have already seen that

$$L \subset \bigcap_{n=1}^{\infty} \mathfrak{H}(A_n),$$

which, in turn, is contained in $\limsup \mathfrak{H}(A_n)$.

COROLLARY 1. All of the extreme points of $L$ are in $A$.

Proof. The topological limit superior is always a closed set. According to a
theorem of Milman (9; 3, p. 84, prop. 4), all of the extreme points of the
closed convex hull of a set are in the closure of that set.

COROLLARY 2.

$$L = \bigcap_{n=1}^{\infty} T_n^*(\mathfrak{H}(\Omega)).$$

This can be proved directly, but at this stage it comes very easily out of a
string of inequalities:

$$L = \bigcap_{n=1}^{\infty} T_n^*(L) \subset \bigcap_{n=1}^{\infty} T_n^*(\mathfrak{H}(\Omega)) = \bigcap_{n=1}^{\infty} \mathfrak{H}(A_n)
\subset \limsup \mathfrak{H}(A_n) \subset \mathfrak{H}(\limsup A_n) = L.$$


We turn now to the development of a sharper result. It is well known that
$M$ is isomorphic and isometric (equivalent in Banach's sense) with the space
of all continuous functions on the compact set $\Omega'$, the isomorphism taking the
element $x \in M$ into the function whose value at $\omega \in \Omega'$ is $\omega(x)$. It is convenient
to identify the two spaces and use the notation $x(\omega)$ to denote $\omega(x)$. To each
continuous linear functional $\phi$ on $M$, there corresponds (5) a unique regular
Borel measure $m$ on $\Omega'$ with the property

$$\phi(x) = \int_{\Omega'} x(\omega)\,dm(\omega) \quad \text{for all } x \in M.$$

Conditions (1)-(4) on the functional $\phi$ may be translated into the following
conditions on the measure $m$: For every Borel set $\Gamma$,

(1') $m(\Gamma) \geq 0$;

(2') $m(T^*\Gamma) = m(\Gamma)$;

(3') $m(\Omega') = 1$;

(4') $m(N) = 0$.

THEOREM 3. If $Q$ is the set of all limit points of sequences $\{T_n^*\omega\}$, $\omega \in \Omega$,
then $\mathfrak{H}(Q) = L$.

Proof. We apply here the technique of Milman, as described in (4). Since
$L$ is a compact, convex set and $Q \subset L$, it suffices to prove that for every
continuous (in the weak* topology) linear functional $f$ on $M^*$,

$$\sup_{\psi \in L} f(\psi) = \sup_{\psi \in Q} f(\psi).$$

But every continuous linear functional on $M^*$ comes from an element of $M$;
that is, given such an $f$, there exists $x \in M$ such that $f(\phi) = \phi(x)$ for all
$\phi \in M^*$. We wish to prove, then, that for all $x \in M$,

(7) $\sup_{\psi \in L} \psi(x) = \sup_{\psi \in Q} \psi(x)$.

Fix $x \in M$. Since $L$ is compact, there is an element $\psi_0$ in $L$ such that

$$\psi_0(x) = \sup_{\psi \in L} \psi(x).$$

Let $m_0$ be the regular Borel measure on $\Omega'$ corresponding to the functional $\psi_0$.
Conditions (1')-(4') are satisfied by $m_0$, so that all of the measure is carried
by the set $\Omega$: $m_0(\Omega) = 1$. The individual ergodic theorem (6) is applicable to
this situation, and we conclude that $\lim_{n \to \infty} T_n x(\omega)$ exists for all $\omega$ in an
invariant (under $T^*$) subset $A$ of $\Omega$ of $m_0$-measure 1. If we let $X(\omega)$ denote this
limit for $\omega \in A$, and $X(\omega) = 0$ for $\omega \in \Omega - A$, then the ergodic theorem
states further that $X$ is a measurable function, invariant under $T^*$, and

(8) $\int_\Omega X(\omega)\,dm_0(\omega) = \int_\Omega x(\omega)\,dm_0(\omega) = \psi_0(x)$.

For each $\omega \in A$, the set $\{T_n^*\omega\}$ has limit points, and according to Theorem 1,
every such limit point is in $L$. Consequently, there is a $\psi$ in $L$, which depends
upon $\omega$ and such that

$$\lim_{n \to \infty} T_n x(\omega) = \psi(x).$$

Since $\psi(x) \leq \psi_0(x)$ for all $\psi \in L$, we have

$$X(\omega) = \lim T_n x(\omega) \leq \psi_0(x)$$

for all $\omega \in A$. This inequality together with equation (8) and the fact that
$m_0(A) = 1$ implies that $X(\omega) = \psi_0(x)$ for almost all $\omega \in A$. In particular,
there is at least one $\omega_0$ for which the equality holds, and any limit point $\chi$ of
$\{T_n^*\omega_0\}$ has the properties:

$$\chi \in Q \text{ and } \chi(x) = \psi_0(x) = \sup_{\psi \in L} \psi(x).$$

This verifies (7) and completes the proof of the theorem (cf. 6a).


4. Almost convergent sequences. Lorentz (7) calls a sequence $x \in M$
almost convergent if $\psi(x)$ is independent of $\psi \in L$. By Theorem 3, it is obviously
sufficient to require that $\psi(x)$ be constant for $\psi \in Q$. Observing that every
$\psi \in Q$ has the form

$$\psi(x) = \lim_\nu T_{n_\nu} x(\omega), \qquad \omega \in \Omega,$$

leads to the following characterization of almost convergent sequences:

LEMMA. In order that there exist a number $\sigma$ such that $\psi(x) = \sigma$ for all
$\psi \in L$, it is necessary and sufficient that

(9) $\lim_{n \to \infty} T_n x(\omega) = \sigma$

for all $\omega \in \Omega'$.

(We write $\Omega'$ rather than $\Omega$ in order to facilitate the coming discussion. Both
are correct.)
The known characterization of almost convergent sequences is the following
(7; see also 10, p. 53):

THEOREM 4. A necessary and sufficient condition for the existence of $\sigma$ such
that $\psi(x) = \sigma$ for all $\psi \in L$ is that

(10) $\lim_{n \to \infty} n^{-1} \sum_{i=1}^{n} \xi_{k+i} = \sigma$

uniformly in $k$.
In our notation, (10) takes the form

(10') $\lim_n T_n x(k) = \sigma$,

uniformly in $k$ ($\in N$). Since $N$ is dense in $\Omega'$ and $T_n x$ is a continuous function
on $\Omega'$, (10') obviously implies (9). But the converse is also true. For (9) implies
weak convergence of the sequence $\{T_n x\}$ in $M$ to the constant $\sigma$. By the mean
ergodic theorem in Banach space (6), this implies convergence in norm, i.e.,
uniform convergence on all of $\Omega'$.
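The uniformity in $k$ required by condition (10) is what separates almost convergence from mere convergence of the arithmetic means. The following sketch is a finite illustration only: it works on a finite prefix of a sequence, and the function name `uniform_deviation` and the two sample sequences are assumptions made for the example. It measures how far the worst window mean of length $n$ lies from a candidate value $\sigma$.

```python
from itertools import accumulate


def uniform_deviation(x, n, sigma):
    """Largest |mean of x[k:k+n] - sigma| over all windows inside the finite prefix x."""
    prefix = [0] + list(accumulate(x))
    return max(
        abs((prefix[k + n] - prefix[k]) / n - sigma)
        for k in range(len(x) - n + 1)
    )


# 0, 1, 0, 1, ...: every window mean is within 1/(2n) of 1/2.
alternating = [i % 2 for i in range(4000)]

# Runs of length 1, 2, 4, ..., alternately 0s and 1s: long windows can sit inside one run.
blocks = [j % 2 for j in range(12) for _ in range(2**j)]

for n in (11, 101, 1001):
    print(n, uniform_deviation(alternating, n, 0.5))
print(uniform_deviation(blocks, 100, 0.5))
```

For the alternating sequence the deviation shrinks like $1/(2n)$, which is the behaviour demanded by (10) with $\sigma = 1/2$; for the block sequence it stays at 1/2 for window lengths that fit inside a single run.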




5. The maximal generalized limit of a given sequence. If $p$ is a
functional on $M$ satisfying the two conditions

(i) $p(x + y) \leq p(x) + p(y)$,

(ii) $p(\lambda x) = \lambda p(x)$ for all $x, y \in M$ and $\lambda \geq 0$,

then, according to the Hahn-Banach theorem, there exists a linear functional
$\phi$ such that $\phi(x) \leq p(x)$ for all $x$. The functional $\tau$ defined by

$$\tau(x) = \sup_{\psi \in L} \psi(x)$$

satisfies conditions (i) and (ii). Noting that for a linear functional $\phi$, $\phi(x) \leq \tau(x)$
for all $x \in M$ implies

$$-\tau(-x) \leq \phi(x) \leq \tau(x),$$

one sees that every such $\phi$ is actually a generalized limit.

It is clear from the proof of the Hahn-Banach theorem (2, p. 28) that not
only is there a linear functional $\phi$ dominated by $p$, but also, if $x_0 \in M$ is
given, $\phi$ may be chosen so that $\phi(x_0) = p(x_0)$. Consequently, if $p$ has the
property that $\phi \leq p$ implies $\phi \in L$, then $p(x) \leq \tau(x)$ for all $x \in M$. In Banach's
proof of the existence of generalized limits the functional which is used for
$p$ is

$$\tau'(x) = \inf \limsup_{k \to \infty} n^{-1} \sum_{i=1}^{n} \xi_{k+m_i},$$

where the infimum is taken over all possible choices of non-negative integers
$m_1, \ldots, m_n$. By what was said earlier, $\tau'(x) \leq \tau(x)$, and it is easy to reverse
this inequality to obtain $\tau'(x) = \tau(x)$. All of this has been observed before
(7).

Now, let us use Theorem 2 to calculate $\tau(x)$ from the terms of the sequence
$\{\xi_n\}$. Since $L = \mathfrak{H}(A)$, where $A = \limsup A_n$, we have

$$\tau(x) = \sup_{\psi \in A} \psi(x).$$

By definition,

$$A = \bigcap_{m=1}^{\infty} F_m,$$

where $F_m$ is the closure of

$$\bigcup_{n=m}^{\infty} A_n.$$

It is easy to see that

$$\sup_{\phi \in A_n} \phi(x) = \limsup_{k \to \infty} n^{-1} \sum_{i=0}^{n-1} \xi_{k+i},$$

so that

$$\sup_{\phi \in F_m} \phi(x) = \sup_{n \geq m} \limsup_{k \to \infty} n^{-1} \sum_{i=0}^{n-1} \xi_{k+i},$$

and finally,

(11) $\sup_{\psi \in A} \psi(x) \leq \inf_m \sup_{n \geq m} \limsup_{k \to \infty} n^{-1} \sum_{i=0}^{n-1} \xi_{k+i}$.

We denote the right side of the inequality by $\tau''(x)$:

$$\tau''(x) = \limsup_{n \to \infty} \limsup_{k \to \infty} n^{-1} \sum_{i=0}^{n-1} \xi_{k+i}.$$

Since $F_m$ is a compact set, the supremum of $\phi(x)$ is attained at some $\phi_m \in F_m$
for each $m$. The sequence $\{\phi_m\}$ has limit points and every one of these limit
points is in $A$. It follows that the inequality in (11) is actually an equality.

THEOREM 5. For every $x = \{\xi_n\}$ in $M$,

$$\sup_{\psi \in L} \psi(x) = \tau'(x) = \tau''(x) = \lim_{n \to \infty} \limsup_{k \to \infty} n^{-1} \sum_{i=0}^{n-1} \xi_{k+i}.$$

Everything has already been proved excepting the existence of the last
limit, which is obtained by observing that

$$\tau'(x) \leq \liminf_{n \to \infty} \limsup_{k \to \infty} n^{-1} \sum_{i=0}^{n-1} \xi_{k+i} \leq \tau''(x).$$

An alternate proof of the equality $\tau'' = \tau$ can be given by using the Hahn-Banach
argument mentioned earlier; namely, if $\phi$ is a linear functional such
that $\phi(x) \leq \tau''(x)$ for all $x$, then $\phi \in L$. Consequently, $\tau'' \leq \tau$, and a simple
computation shows that $\psi \in L$ implies $\psi \leq \tau''$. The argument based upon
Theorem 2, although more complicated than this one, has the advantage
that it enables one to discover the form of the functional $\tau''$. In fact, the
equality $\tau' = \tau''$ does not seem to have been noticed before.
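The double limit in Theorem 5 can be explored on finite data. The sketch below is a rough numerical stand-in, not a computation of the true limits: it replaces $\limsup_{k \to \infty}$ by a maximum over window positions $k \geq k_0$ inside a finite prefix, and the names `run_blocks`, `window_means`, `upper_estimate` and `lower_estimate` are hypothetical helpers introduced for the example. The lower estimate corresponds to $-\tau(-x)$, the smallest value a generalized limit can assign to $x$.

```python
def run_blocks(levels: int) -> list:
    """0-1 sequence made of runs of length 1, 2, 4, ..., alternately 0s and 1s."""
    x = []
    for j in range(levels):
        x.extend([j % 2] * 2**j)
    return x


def window_means(x, n):
    """All means (x[k] + ... + x[k+n-1]) / n, computed with a running sum."""
    total = sum(x[:n])
    means = [total / n]
    for k in range(1, len(x) - n + 1):
        total += x[k + n - 1] - x[k - 1]
        means.append(total / n)
    return means


def upper_estimate(x, n, k0):
    """Finite stand-in for limsup_k of the length-n window means."""
    return max(window_means(x, n)[k0:])


def lower_estimate(x, n, k0):
    """Finite stand-in for liminf_k of the length-n window means."""
    return min(window_means(x, n)[k0:])


alternating = [i % 2 for i in range(4000)]
blocks = run_blocks(12)

print(upper_estimate(alternating, 50, 1000), lower_estimate(alternating, 50, 1000))
print(upper_estimate(blocks, 50, 1000), lower_estimate(blocks, 50, 1000))
```

On the alternating sequence both estimates agree at 1/2, as expected for an almost convergent sequence; on the block sequence they are 1 and 0, so generalized limits of that sequence can range over the whole interval between them.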

It ought to be possible to prove that $\tau' = \tau''$ directly from the expressions
for $\tau'$ and $\tau''$ in terms of the sequence $\{\xi_n\}$, and this was done by Professor
J. H. B. Kemperman. The essential step in his proof (unpublished) is the
following inequality: if


QO = mM, < Me < + = 


and p is any positive integer, then 


° 1 - i l - < 
lim sup p > E+; < lim sup » 2s grams + 
ker i=1 kaw 1 


(mn — 1)(m, — my). 




REFERENCES 


1. R. F. Arens and J. L. Kelley, Characterizations of the space of continuous functions over a
compact Hausdorff space, Trans. Amer. Math. Soc., 62 (1947), 499-508.

2. S. Banach, Théorie des opérations linéaires (Warsaw, 1932).

3. N. Bourbaki, Espaces vectoriels topologiques (Actualités scientifiques et industrielles, no.
1189, Paris, 1953).

4. M. Jerison, A property of extreme points of compact convex sets, Proc. Amer. Math. Soc., 4
(1954), 782-783.

5. S. Kakutani, Concrete representations of abstract (M)-spaces, Ann. of Math., 42 (1941),
994-1024.

6. S. Kakutani, Ergodic theory, Proc. Internat. Congr. Math., Cambridge, 1950, II, 128-142.

6a. N. Kryloff and N. Bogoliouboff, La théorie générale de la mesure dans son application à
l'étude des systèmes dynamiques de la mécanique non linéaire, Ann. of Math., 38 (1937),
65-113.

7. G. G. Lorentz, A contribution to the theory of divergent sequences, Acta Math., 80 (1948),
167-190.

8. S. Mazur, On the generalized limit of bounded sequences, Colloq. Math., 2 (1951), 173-175.

9. D. Milman, Characteristics of extremal points of regularly convex sets, Doklady Akad. Nauk
SSSR (N.S.), 57 (1947), 119-122.

10. J. von Neumann, Invariant Measures (Institute for Advanced Study, Princeton, 1940-1941).


Purdue University 











A NOTE ON ASYMPTOTIC SERIES 
HARRY F. DAVIS 


Introduction. We extend some observations of Popken (2) on the
algebraic foundations of the theory of asymptotic series. The main result is
the theorem in §5 which characterizes, for a particular function space, a
class of linear functionals defined in §4. In §3 we discuss another class of linear
functionals related to asymptotic series. In the first two paragraphs we give
definitions which render this note self-contained.

This note grew out of a department seminar led by T. E. Hull, to whom I
am indebted for many stimulating discussions.


1. Asymptotic sums. Let $c_0 + c_1 x + c_2 x^2 + \cdots$ be a formal power
series with real coefficients. By an asymptotic sum (as $x \to 0+$) of the series
is meant a real-valued function $f$, having as domain the positive reals, such
that for all nonnegative integers $n$,

$$f(x) - c_0 - c_1 x - \cdots - c_n x^n = o(x^n), \qquad x \to 0+.$$

If $f$ has this property, so also has, for example, $f(x) + e^{-1/x}$; the asymptotic
sum of a formal series is therefore not unique. It is known that every such
series has an asymptotic sum, which can be constructed in the following way.
Let $S$ denote a nested sequence

$$N_0 \supset N_1 \supset N_2 \supset \cdots$$

of neighborhoods of $x = 0$ with characteristic functions $u_0, u_1, u_2, \ldots$. Assume
further that any positive $x$ is in at least one and at most finitely many of these
neighborhoods, so the series

$$c_0 u_0(x) + c_1 u_1(x) x + c_2 u_2(x) x^2 + \cdots$$

converges for all positive $x$ and defines a function $f(S, x)$. Then for a suitable
nest $S$, whose choice depends on the power series, the function $f(S, x)$ is an
asymptotic sum of the formal series. If we select a fixed $S$, and associate
with each formal power series the corresponding $f(S, x)$, we obtain an isomorphism
between the ring of all formal power series and a ring of functions.
Since it is impossible to find a nest $S$ that gives an asymptotic sum for all
series, this isomorphism pairs each series with an asymptotic sum of the series
only over a subring. We do not know if, by some other method, it is possible
to obtain a correspondence between the ring of all formal power series and a
ring of functions, which is not only a ring isomorphism but also pairs each
series with a function that is one of its asymptotic sums.
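The nested-neighborhood construction can be tried out on a divergent series such as $\sum n!\,x^n$. The sketch below is one possible choice of nest, stated as an assumption rather than as the construction intended in the text: $N_0$ is all of the positive reals and $N_n = (0, \varepsilon_n)$ with $\varepsilon_n = \min(\varepsilon_{n-1}/2,\ 1/(|c_n| + 1))$, so that $|c_n x^n| \leq x^{n-1}$ on $N_n$. Only finitely many coefficients are supplied, so the computed values are meaningful for $x$ no smaller than the last radius. The names `radii`, `f_S` and `remainder_ratio` are hypothetical.

```python
import math


def radii(coeffs):
    """Assumed nest: eps[0] = inf and eps[n] = min(eps[n-1] / 2, 1 / (|c_n| + 1))."""
    eps = [math.inf]
    for c in coeffs[1:]:
        eps.append(min(eps[-1] / 2, 1.0 / (abs(c) + 1.0)))
    return eps


def f_S(coeffs, eps, x):
    """Sum of c_n x^n over those n for which x lies in N_n = (0, eps[n])."""
    return sum(c * x**n for n, (c, e) in enumerate(zip(coeffs, eps)) if x < e)


def remainder_ratio(coeffs, eps, x, n):
    """(f(S, x) - c_0 - ... - c_n x^n) / x^n, which should tend to 0 as x -> 0+."""
    partial = sum(c * x**m for m, c in enumerate(coeffs[: n + 1]))
    return (f_S(coeffs, eps, x) - partial) / x**n


coeffs = [math.factorial(n) for n in range(30)]  # the divergent series sum n! x^n
eps = radii(coeffs)

for x in (1e-2, 1e-3, 1e-4):
    print(x, remainder_ratio(coeffs, eps, x, 2))
```

With this nest each positive $x$ picks up only finitely many terms, and the printed ratios decrease with $x$, in line with the $o(x^n)$ requirement for $n = 2$.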


Received March 1, 1956.




2. A class of linear functionals. Denote by $C$ the collection of all
functions which are asymptotic sums of formal power series (we do not allow
negative exponents). If $f$ is an asymptotic sum of the series

$$c_0 + c_1 x + c_2 x^2 + \cdots,$$

let $L_n f = c_n$. Then $L_n$ is a linear functional on $C$. If $m$ is the first nonnegative
integer such that $L_m f$ is not zero, define $\phi(f) = e^{-m}$; if $L_m f = 0$ for all $m$,
define $\phi(f) = 0$. Then $d(f, g) = \phi(f - g)$ is a pseudo-metric on $C$; two functions
$f$ and $g$ are asymptotically equal (as $x \to 0+$) if $d(f, g) = 0$. These
definitions are specializations to $C$ of those given by Popken in (2).
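Since $\phi$ and $d$ depend only on the coefficients $L_m f$, they are easy to compute once a function is represented by the leading coefficients of its asymptotic series. The sketch below makes exactly that simplifying assumption: a function is stood in for by a finite coefficient list, with all later coefficients taken to be zero, and `phi` and `d` are illustrative names.

```python
import math


def phi(coeffs):
    """exp(-m) for the first index m with a nonzero coefficient, and 0.0 if there is none."""
    for m, c in enumerate(coeffs):
        if c != 0:
            return math.exp(-m)
    return 0.0


def d(f, g):
    """Pseudo-metric d(f, g) = phi(f - g) on coefficient lists of possibly different length."""
    length = max(len(f), len(g))
    f = list(f) + [0] * (length - len(f))
    g = list(g) + [0] * (length - len(g))
    return phi([a - b for a, b in zip(f, g)])


f = [1, 2, 3]       # stands for any asymptotic sum of 1 + 2x + 3x^2
g = [1, 2, 5, 7]    # stands for any asymptotic sum of 1 + 2x + 5x^2 + 7x^3

print(d(f, g))      # the series first differ at the x^2 term
print(d(f, f))      # asymptotically equal functions are at distance 0
```

Two functions are close in this pseudo-metric exactly when their series agree to high order, which is why $d(f_n, f) \to 0$ for the partial sums $f_n$ of the series of $f$.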


3. Continuous linear functionals. It is natural to consider linear
functionals on $C$ which are continuous in the sense that $Lf_n \to Lf$ whenever
$f_n \to f$, i.e. whenever $d(f_n, f) \to 0$. Such a functional $L$ is constant on the
classes of asymptotically equal functions.

LEMMA 1. Every continuous linear functional on $C$ is a (finite) linear combination
of functionals of the type described in §2.

Proof. Let $L$ be a continuous linear functional, and let $L(x^n) = a_n$. If
$a_n \neq 0$, let $c_n = 1/a_n$; otherwise let $c_n = 0$. Let

$$f_n(x) = c_0 + c_1 x + \cdots + c_n x^n.$$

Let $f$ be any asymptotic sum of the formal series $c_0 + c_1 x + c_2 x^2 + \cdots$
($f$ exists, as remarked in §1). Then $d(f_n, f) \to 0$, so by continuity $L(f_n) \to L(f)$.
Since $L(f_{n+1}) = L(f_n) + 1$ whenever $a_{n+1} \neq 0$, all but a finite number of the
$a_n$ must equal zero.

Now let $d_0 + d_1 x + d_2 x^2 + \cdots$ be any formal series, and let $g$ be any one
of its asymptotic sums. Let

$$g_n(x) = d_0 + d_1 x + \cdots + d_n x^n.$$

Then $d(g_n, g) \to 0$, so $L(g_n) \to L(g)$, and since all $a_m$ are zero for $m > N$ (say),
we have

$$L(g_n) = d_0 a_0 + d_1 a_1 + \cdots + d_N a_N, \qquad n > N.$$

Therefore $L(g) = L(g_N)$. Recalling the definition of $L_n$, we have proved that

$$L = a_0 L_0 + a_1 L_1 + \cdots + a_N L_N,$$

completing the proof of the Lemma.

Conversely, any linear combination of the $L_n$ is a continuous linear functional
on $C$. Since this class of linear functionals has such a transparent structure, it
seems to be of limited interest.


4. Asymptotic continuity.

DEFINITION. A linear functional $L$ is asymptotically continuous on a subspace
of $C$ if, whenever $h$ is in this subspace, and $h(x) = o(x^n)$ as $x \to 0+$
for some nonnegative integer $n$, it follows that $h^*(t) = o(t^n)$ as $t \to 0+$,
where $h^*$ is defined by $h^*(t) = L(h(xt))$. Here $L$ acts on functions of the
positive variable $x$, $t$ being a positive parameter. The transform $h^*$ is not required
to be an element of $C$.

It is easily verified that the functionals $L_n$ of §2 are asymptotically continuous
on $C$; it follows from Lemma 1 that any continuous linear functional
on $C$ is also asymptotically continuous. We do not know if there exist any
asymptotically continuous linear functionals on $C$ that are not also continuous;
perhaps the definitions are equivalent for functionals defined and linear on
all of $C$. On certain subspaces the two definitions are not equivalent, and we
now consider one of these subspaces. Certain other subspaces can be treated
by essentially identical methods, but we give the details for only one of them
here.

Let $H$ denote the smallest subspace of $C$ containing the functions $\{x^m\}$
($m = 0, 1, 2, \ldots$), and $\{e^{-cx}\}$ for all nonnegative real numbers $c$. A general
element of $H$ is simply a finite linear combination of such functions. It is
easily seen that the following properties are equivalent in $H$; a function
possessing any one of them possesses them all:

(1) $f(x) = o(x^n)$ as $x \to 0+$,

(2) $f(xt) = o(x^n)$ as $x \to 0+$, for all fixed positive $t$,

(3) for some positive $t$, $f(xt) = o(x^n)$ as $x \to 0+$,

(4) $L_m f = 0$ ($m = 0, 1, \ldots, n$), where $L_m$ is as defined in §2.


LEMMA 2. If $L$ is an asymptotically continuous linear functional on $H$ and
if $L(x^m) = (-1)^m c_m m!$ for all nonnegative integers $m$, then $L(e^{-xt})$ is an asymptotic
sum of the formal series $c_0 + c_1 t + c_2 t^2 + \cdots$.

Proof. For each $n$, and each $t$,

$$e^{-xt} - \sum_{m=0}^{n} \frac{(-1)^m x^m t^m}{m!} = o(x^n), \qquad x \to 0+;$$

thus by definition of asymptotic continuity (using the linearity of $L$) we have

$$L(e^{-xt}) - c_0 - c_1 t - \cdots - c_n t^n = o(t^n), \qquad t \to 0+.$$


The following lemma and the theorem in §5 both show the existence of a 
large class of such linear functionals. 


LEMMA 3. If $\alpha(x)$ is a function of bounded variation possessing all moments

$$\int_0^{\infty} x^n\,d\alpha(x) < \infty, \qquad n = 0, 1, 2, \ldots,$$

then the linear functional

$$L(f) = \int_0^{\infty} f(x)\,d\alpha(x)$$

is asymptotically continuous on $H$.




Proof. If $h(x) = o(x^n)$, by Taylor's theorem

$$h(x) = h^{(n+1)}(\theta x)\,x^{n+1} / (n+1)!$$

where $0 < \theta < 1$, $\theta$ depending on $x$. Since $h^{(n+1)}(x)$ is a linear combination of
terms, each of which is a power of $x$ or a function of the form $e^{-cx}$ for nonnegative
$c$, each term of $h^{(n+1)}(\theta x t)$, for $0 < \theta < 1$, $0 < t < 1$, is dominated
in magnitude either by the corresponding term of $h^{(n+1)}(x)$ or by a constant
(for $x > 1$). The hypothesis that $\alpha(x)$ possesses all moments then ensures
that the integral

$$\int_0^{\infty} h^{(n+1)}(\theta x t)\,x^{n+1}\,d\alpha(x)$$

is finite and bounded for $0 < t < 1$.
We then have

$$\frac{h^*(t)}{t^n} = \frac{L(h(xt))}{t^n} = \frac{1}{t^n} \int_0^{\infty} h(xt)\,d\alpha(x)
= \frac{t}{(n+1)!} \int_0^{\infty} h^{(n+1)}(\theta x t)\,x^{n+1}\,d\alpha(x) \to 0, \qquad t \to 0+,$$

proving that $h^*(t) = o(t^n)$, $t \to 0+$.

In §1, one method of constructing asymptotic sums was described. We now
describe another method, based on the theory of the moment problem. This
generalizes one discussed by E. Borel in (1). Given any sequence $a_0, a_1,
a_2, \ldots$ of real numbers, there exists (3, p. 139) an $L$ of the form of Lemma 3
such that $L(x^m) = a_m$. Given a formal power series

$$c_0 + c_1 t + c_2 t^2 + \cdots,$$

we take $a_m = (-1)^m c_m m!$; by Lemma 3 this $L$ is asymptotically continuous
on $H$. Since $L(x^m) = (-1)^m c_m m!$ the function $L(e^{-xt})$ is an asymptotic sum
of the given series, by Lemma 2.
One of the simplest examples is provided by the convergent series

$$1 - t + t^2/2! - t^3/3! + \cdots;$$

here $a_m = 1$ for all $m$, and we obtain $L(x^m) = 1$ for all $m$ by taking $\alpha(x)$ in
Lemma 3 to be constant except for a unit jump at $x = 1$. Then $L(e^{-xt}) = e^{-t}$;
this is actually the sum of the series. In less trivial cases, the determination
of $\alpha(x)$ may be a formidable task. Thus this method is difficult to apply;
in a later paper we shall give more convenient methods based on the observation
that any formal power series is the series expansion of a Schwartz distribution.
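The unit-jump example can be followed step by step in code. The sketch below is an illustration under the stated setup: with $\alpha$ constant except for a unit jump at $x = 1$, the functional of Lemma 3 reduces to evaluation at 1, the coefficients are recovered from $L(x^m) = (-1)^m c_m m!$, and the remainder of $L(e^{-xt})$ against the partial sums is compared with $t^n$. The names `L`, `c` and `remainder_ratio` are chosen for the example.

```python
import math


def L(f):
    """Stieltjes integral against a unit jump at x = 1, i.e. evaluation at 1."""
    return f(1.0)


def c(m: int) -> float:
    """Coefficient c_m recovered from the relation L(x^m) = (-1)^m c_m m!."""
    return (-1) ** m * L(lambda x: x**m) / math.factorial(m)


def remainder_ratio(t: float, n: int) -> float:
    """(L(e^{-xt}) - c_0 - c_1 t - ... - c_n t^n) / t^n, expected to tend to 0 with t."""
    transform = L(lambda x: math.exp(-x * t))
    partial = sum(c(m) * t**m for m in range(n + 1))
    return (transform - partial) / t**n


print([c(m) for m in range(5)])   # coefficients of 1 - t + t^2/2! - t^3/3! + ...
for t in (0.1, 0.01, 0.001):
    print(t, remainder_ratio(t, 3))
```

Here $L(e^{-xt}) = e^{-t}$, and the printed ratios behave like $t/24$, the leading term of the Taylor remainder divided by $t^3$.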

Despite these difficulties, the existence of this method shows that any
linear functional $L$ on the space of all polynomials can be extended to an asymptotically
continuous linear functional on $H$. The extension is not unique. Moreover,
the following remark shows that the extension may not even be possible,
if we demand continuity in the sense of §3, and this is one justification for
introducing the definition of asymptotic continuity.













There exist asymptotically continuous linear functionals on $H$ that are not
continuous. For if all asymptotically continuous $L$ were continuous, we would
have

$$L(e^{-xt}) = \sum_{n=0}^{\infty} \frac{(-1)^n L(x^n)}{n!}\,t^n = \sum_{n=0}^{\infty} c_n t^n,$$

the left-hand side being a function of $t$, and the sum being necessarily convergent.
The theory of the moment problem implies that, by proper choice
of such $L$, we can obtain any given formal power series on the right side,
an obvious contradiction.


5. The main theorem. We prove a theorem which characterizes all
asymptotically continuous linear functionals on $H$. Since $L$ is determined
completely by its values on the basis functions, let $L_1$ be the function defined
for all nonnegative integers $m$ by $L_1(m) = L(x^m)$, and let $L_2$ be a function
on the positive reals defined by $L_2(t) = L(e^{-xt})$. Then $L$ can be identified with
the pair $\{L_1, L_2\}$, since any arbitrary pair of such functions determines
uniquely a linear functional $L$ (not necessarily asymptotically continuous).
Since $L_2(t)$ may be discontinuous, it is an easy corollary of this theorem that
not every asymptotically continuous linear functional on $H$ is of the type
described in Lemma 3.


THEOREM. $L = \{L_1, L_2\}$ is asymptotically continuous on $H$ if and only if
$L_2(t)$ is an asymptotic sum of the formal power series

$$\sum_{m=0}^{\infty} \frac{(-1)^m L_1(m)\,t^m}{m!}.$$

Proof. The necessity is a rewording of Lemma 2, so we only need prove
the sufficiency. Any element of $H$ may be written in the form (both sums finite)

$$h(x) = \sum_{m} a_m x^m + \sum_{j} b_j e^{-c_j x},$$

and by the remark preceding Lemma 2, if $h(x) = o(x^n)$ as $x \to 0+$, we
compute

$$a_m = -\sum_{j} \frac{(-1)^m c_j^m b_j}{m!}, \qquad m = 0, 1, \ldots, n.$$

Thus we have

$$L(h(xt)) = \sum_{j} b_j \Big\{ L_2(c_j t) - \sum_{m=0}^{n} \frac{(-1)^m c_j^m L_1(m)\,t^m}{m!} \Big\} + \sum_{m > n} a_m L_1(m)\,t^m.$$

If $L_2(c_j t)$ is an asymptotic sum of

$$\sum_{m=0}^{\infty} \frac{(-1)^m L_1(m)\,(c_j t)^m}{m!},$$

each term in the equation is $o(t^n)$, $t \to 0+$, proving that $L$ is asymptotically
continuous.




REFERENCES 


1. E. Borel, Leçons sur les séries divergentes (Paris, 1928).

2. J. Popken, Asymptotic expansions from an algebraic standpoint, Nederl. Akad. Wetensch.
Proc. Ser. A 56 = Indagationes Math. 15 (1953), 131-143.

3. D. V. Widder, The Laplace Transform (Princeton, 1946).


University of British Columbia











ON WARD’S PERRON-STIELTJES INTEGRAL 


RALPH HENSTOCK 


Introduction. In the paper (5), Ward defines an integral of Perron type 
of a finite function f with respect to another finite function g, where g need 
not be of bounded variation. There arise two problems, (a) and (b) below, 
that have not been dealt with in (5). 

If $f = j$ at a countable number of points everywhere dense in $(a, b)$, where
$f$ and $j$ are both integrable with respect to $g$, then $f - j$ can be nonzero on a
large set of points of $(a, b)$. For example, if $g$ is continuous and of bounded
variation the countable number of points can be neglected in the integration
and we can have $f \ne j$ everywhere else. But $g$ is more rigidly fixed when we
know its values on an everywhere dense set, if the integral exists. For example,
if $g$ is of bounded variation, and so continuous except at an at most countable
set of points, we can only vary the values of $g$ at a countable set of points.
More generally, we have problem

(a) If $f$ is integrable with respect to $g$, and with respect to $h$, over the closed
interval $[a, b]$, where $g = h$ at points everywhere dense in $[a, b]$, what are the
properties of the difference $g - h$ and the set of points where the difference is not
zero?

This question is partially answered by Theorems 1 and 2, and we obtain 
the following result. 

Let $\bar E_\varepsilon$ be the closure of the set of $u$ for which
\[ |g(u) - h(u)| \ge \varepsilon, \qquad a \le u \le b. \tag{1} \]

Then $f$ must be VBG and continuous on¹ $\bar E_\varepsilon$, and $mf(\bar E_\varepsilon) = 0$.

However, if $f$ is integrable with respect to $g$ in $[a, b]$, and if $g - h$ satisfies
(1) and is 0 at an everywhere dense set of points in $[a, b]$, it does not follow
that $f$ is integrable with respect to $h$ in $[a, b]$. For example, take $g = 0$ and
suppose that each set $E_\varepsilon$ contains only a finite number of points and so has
no limit-points. Then every function $f$ is trivially VBG and continuous on
$\bar E_\varepsilon = E_\varepsilon$, and $f(\bar E_\varepsilon)$ contains only a finite number of points. But if the set of
points where $h \ne 0$ does not satisfy Theorem 3 (9), (10), (11), with $j$ replaced
by $h$, it follows by Theorem 3 that there is a finite function $f$ for which the
Perron-Stieltjes integral of $f$ with respect to $h$ over $[a, b]$ does not exist. See
the example of Theorem 5 (38) in §4.

There is another question of integrability, namely, 

(b) What are the properties of $g$ in order that all bounded Baire² functions $f$
are integrable with respect to $g$ in $[a, b]$?





Received October 6, 1955.

¹I.e., when we use only the points of $\bar E_\varepsilon$.

²A Baire (Borel-measurable) function is any function that can be obtained from continuous
functions by using repeated limits.




Question (b) is partially answered in (2), Theorem 2, and we give the
complete answer in Theorem 3 of the present paper.


1. Notation. We suppose that all functions considered are defined and
finite in $a \le u \le b$, this interval being denoted by $[a, b]$. The existence of an
integral or limit is taken to mean its existence as a finite number. If the
limits exist,
\[ f(u-) = \lim_{v \to u,\ a \le v < u \le b} f(v), \qquad f(u+) = \lim_{v \to u,\ a \le u < v \le b} f(v). \]
Integral signs preceded by (LS), (PS), denote respectively the Lebesgue-Stieltjes
and Perron-Stieltjes integrals, and we put
\[ P(v, w) = P(f, g; v, w) = (PS) \int_v^w f(u)\, dg(u), \]
$f(E) = \{f(u) : u \in E\}$ where $E$ is a set contained in $[a, b]$. A point $v$ in $[a, b]$
is a point of infinite variation on $[a, b]$ of the function $f$ if, for each open interval
$(\xi, \eta)$ containing $v$, the function $f$ is not of bounded variation on
\[ [\xi, \eta] \cap [a, b]. \]

It follows that the set $W$ of points of infinite variation on $[a, b]$ of $f$ is closed.
For if $v$ is not in $W$ there is an open interval $(\xi, \eta)$ containing $v$, such that $f$
is of bounded variation on
\[ [\xi, \eta] \cap [a, b], \]
and then $(\xi, \eta)$ is contained in $CW$.

The symbols $E'$, $\bar E$, $CE$, $mE$ denote respectively the derived set, the closure,
the complement, and the measure of a set $E$ in $[a, b]$. The interior of $E$ is the
largest open set contained in $E$.


2. The examination of question (a)


THEOREM 1. If $P(f, g; a, b)$ and $P(f, h; a, b)$ exist, and if $g = h$ at points
everywhere dense in $[a, b]$, then for all $v$, $w$ in $a \le v < w \le b$,
\[ P(f, g; v, w) = P(f, h; v, w) + [f(g - h)]_v^w. \]

Proof. It is enough to assume that $h = 0$, so that $g = 0$ at points everywhere
dense in $[a, b]$. Let $M_1$ and $M_2$ be a major and a minor function, in Ward's
sense, of $f$ with respect to $g$ in $[a, b]$ and take $u$ in $[a, b]$. Then there is
$\delta_1(u) > 0$ depending on $u$, $M_1$, $M_2$, such that
\[ [M_1]_u^\xi \ge f(u)[g]_u^\xi \ge [M_2]_u^\xi, \qquad 0 \le \xi - u < \delta_1(u), \tag{2} \]
\[ [M_1]_u^\xi \le f(u)[g]_u^\xi \le [M_2]_u^\xi, \qquad 0 \ge \xi - u > -\delta_1(u). \tag{3} \]
As in (2), §2, the proof of Theorem 1, we can prove that in each $[v, w]$ there is
a finite number of points
\[ v = a_0 = u_1 < a_1 < \dots < a_n = u_n = w, \qquad a_{p-1} \le u_p \le a_p \quad (p = 2, \dots, n-1), \]
such that
\[ g(a_p) = 0 \quad (p = 1, \dots, n-1), \qquad a_p - a_{p-1} < \delta_1(u_p) \quad (p = 1, \dots, n). \]
Thus (2), (3) are satisfied with $u = u_p$, $\xi = a_p$, and $u = u_p$, $\xi = a_{p-1}$, respectively,
and we obtain
\[ [M_1]_v^w = \sum_{p=1}^{n} [M_1]_p \ge [fg]_v^w \ge \sum_{p=1}^{n} [M_2]_p = [M_2]_v^w, \]
where $[M]_p$ stands for
\[ M(a_p) - M(a_{p-1}). \]
Thus as $P(v, w)$ exists, the Theorem must be true for $h = 0$, and so generally.
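
As an illustration of Theorem 1 (a sketch under an assumed choice of $g$ and $h$, not an example taken from the paper), suppose $g$ and $h$ differ only at a single interior point $c$, $a < c < b$, with $g(c) - h(c) = \gamma$. Then $g = h$ on an everywhere dense set, and the correction term can be evaluated directly for $v < w$:

```latex
% Assumption for illustration: g - h vanishes except at one point c, where it equals \gamma.
\[
[f(g-h)]_v^w = f(w)\{g(w)-h(w)\} - f(v)\{g(v)-h(v)\}
= \begin{cases}
0, & v \ne c,\ w \ne c,\\
\gamma f(c), & w = c,\\
-\gamma f(c), & v = c.
\end{cases}
\]
```

So the two integrals agree over any interval not ending at $c$, and differ by $\pm\gamma f(c)$ over an interval having $c$ as an end point.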


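THEOREM 2. If, for all $u$ in $a \le u \le b$,
\[ P(f, g; a, u) = [fg]_a^u, \tag{4} \]
then (5) $f$ is VBG and continuous on $\bar E_\varepsilon$, and (6) $mf(\bar E_\varepsilon) = 0$, where $E_\varepsilon$ is the
set of $u$ for which
\[ |g(u)| \ge \varepsilon, \qquad a \le u \le b, \quad \varepsilon > 0. \]

COROLLARY. If (4) is true, and if $\bar E_\varepsilon$ contains an interval $[\xi, \eta]$ for some
$\varepsilon > 0$, then $f$ is constant in $[\xi, \eta]$.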


From Theorem 2 Corollary we can easily prove Theorem 1 of (2). 

To prove Theorem 2 let $a \le u < v \le b$ and let $M_1$, $M_2$ be arbitrary major
and minor functions of $f$ with respect to $g$ in Ward's sense, and write
$\chi_1 = M_1 - M_2$. Then $\chi_1$ is monotone increasing. Now, for fixed $u$ and for
sufficiently small and positive $v - u$, both functions
\[ f(u)[g]_u^v, \qquad P(u, v) \]
lie between
\[ [M_1]_u^v, \qquad [M_2]_u^v, \]
so that
\[ |P(u, v) - f(u)[g]_u^v| \le [\chi_1]_u^v. \]
Substituting in the value of $P(u, v)$ from (4) we obtain
\[ |g(v)[f]_u^v| \le [\chi_1]_u^v. \]
Hence there is a $\delta_2(u) > 0$ such that if
\[ v \in E_\varepsilon, \qquad 0 < v - u < \delta_2(u), \]
we have
\[ |[f]_u^v| \le \varepsilon^{-1}[\chi_1]_u^v. \tag{7} \]
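
The substitution of (4) into the preceding inequality rests on a one-line identity, written out here for convenience. It uses only (4), the additivity of the integral, and the notation $[F]_u^v = F(v) - F(u)$, which is assumed from the way the brackets are used in the proof of Theorem 1.

```latex
\begin{align*}
P(u,v) - f(u)[g]_u^v
 &= [fg]_u^v - f(u)\{g(v) - g(u)\} \\
 &= f(v)g(v) - f(u)g(u) - f(u)g(v) + f(u)g(u) \\
 &= g(v)\{f(v) - f(u)\} = g(v)[f]_u^v .
\end{align*}
```

When $|g(v)| \ge \varepsilon$, dividing by $|g(v)|$ gives (7).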







Similarly for $v < u$. If
\[ w \in \bar E_\varepsilon, \qquad 0 < w - u < \delta_2(u), \]
then there is a $v$ satisfying
\[ v \in E_\varepsilon, \qquad 0 < v - u < \delta_2(u), \]
and arbitrarily near to $w$, so that by (7),
Uflel < IOAN + LATE) < Daa + fra, 
(8) Ll < €"Lal® < "fall, 
lim sup |[/¥| < «“"[Lxall, lim f(w) = f(u), 


as $\chi_1(b) - \chi_1(a)$ is arbitrarily small.
Similar results hold for 


w<uuet Ee’ ,wel Eywou, 


so that $f$ is continuous when we only use the points of the derived set of $E_\varepsilon$.
As the other points of $\bar E_\varepsilon$ are isolated, $f$ is continuous on $\bar E_\varepsilon$.

To show that $f$ is VBG on $\bar E_\varepsilon$ we use the method of the first part of the proof
of (5, p. 592, Lemma 6) and we employ only points of $\bar E_\varepsilon$. The relevant
inequality is the first one in (8).

To prove (6) we first add $\theta(u - a)$ to $\chi_1(u)$ if necessary, to ensure that
$\chi_1$ is strictly increasing. The constant $\theta > 0$ can be arbitrarily small. Then as
in (5, p. 581, Lemma 3) we prove from the first inequality of (8), and the
similar inequality when $w < u$, that
\[ m^* f(\bar E_\varepsilon) \le 2\varepsilon^{-1}[\chi_1]_a^b, \]
where $m^*$ denotes outer measure. The factor 2 occurs because of the $w+$ in
(8). As the right-hand side is arbitrarily small we obtain (6).

To prove the Corollary we note that by (5), $f$ is continuous on $[\xi, \eta]$. Thus
if $f([\xi, \eta])$ contains two distinct points it contains the whole interval between
the points. This is impossible by (6).


3. The integrability of Perron-Stieltjes integrals. In this section we
prove two theorems, completely answering question (b). We begin with a
lemma needed in the proof of the converse of Theorem 3.


LEMMA. Let $F$ be a sequence $\{I_n\}$ of open intervals, and let $H_p$ be the set of
points of $[a, b]$ lying in at most $p$ intervals of $F$. Then all the intervals $I_n$ covering
the points of $H_p$ can be put into at most $3p$ sets of non-overlapping intervals.


We can define a sequence $\{\xi_r\}$ of points of $H_p$ such that their closure contains
$H_p$. Each interval $I_n$ covering a point of $H_p$ will then also cover at least one
$\xi_r$, and conversely. Thus we need only consider the intervals covering the $\xi_r$.

We put the $q$th interval of the sequence $\{I_n\}$ that covers $\xi_1$ into the set $S_q$.
Then $1 \le q \le p$, as $\xi_1$ lies in $H_p$. Suppose that the intervals $I_n$ covering
$\xi_1, \dots, \xi_{r-1}$ have been arranged into sets $S_q$ $(1 \le q \le 3p)$ of non-overlapping
intervals, and let $\xi_r$ lie between $\xi_s$ and $\xi_t$ for $s < r$, $t < r$, with no $\xi_q$ $(q < r)$
between $\xi_s$ and $\xi_t$. Then there are at most $p$ intervals $I_n$ covering $\xi_s$, and at
most $p$ intervals $I_n$ covering $\xi_t$, so that at least $p$ of the sets $S_1, \dots, S_{3p}$, say
$T_1, \dots, T_p$, will be free from intervals $I_n$ that cover $\xi_s$ or $\xi_t$, and so will contain
no interval lying in $(\xi_s, \xi_t)$. The intervals $I_n$ covering $\xi_r$ that have not already
been put into sets $S_q$, cannot cover $\xi_s$ nor $\xi_t$, and so must lie between $\xi_s$ and
$\xi_t$. We can therefore put these intervals into some or all of the sets $T_1, \dots, T_p$.
Similarly if
\[ \xi_r < \min_{q < r} \xi_q \quad\text{or}\quad \xi_r > \max_{q < r} \xi_q, \]
in which case one of $\xi_s$, $\xi_t$ is missing. Hence the result is true for $\xi_1, \dots, \xi_r$.
It is true for $\xi_1$ and hence true in general.


THEOREM 3. If, for a given function $j$, for all bounded Baire functions $f$
defined in $[a, b]$, and for all $u$ in $[a, b]$, the integral $P(f, j; a, u)$ exists equal to
\[ [fj]_a^u, \]
then the set of points $u$ in $a \le u \le b$, where $j(u) \ne 0$, can be divided into two
sequences $\{u_n\}$ and $\{d_n\}$, with the properties

(9) $\sum_{n=1}^{\infty} |j(u_n)| < \infty$;

(10) surrounding each $d_n$, there is an open interval $I(d_n) = (\underline{d}_n, \overline{d}_n)$ contained in
$(a, b)$ such that each point of $[a, b]$ can lie in an at most finite number of the $I(d_n)$;

(11) there is a monotone increasing bounded function $\chi$ such that
\[ \chi(\overline{d}_n+) - \chi(d_n) \ge |j(d_n)|, \qquad \chi(d_n) - \chi(\underline{d}_n-) \ge |j(d_n)|. \]

Conversely, if $j$ satisfies (9), (10), (11), and if $f$ is bounded in $[a, b]$, then
$P(f, j; a, u)$ exists and is equal to
\[ [fj]_a^u \]
for all $u$ in $a \le u \le b$.


To begin the proof of the first part of Theorem 3 we replace $g$ by $j$ in
Theorem 2, obtaining from (5) that $f$ is continuous on $\bar E_\varepsilon$, where $E_\varepsilon$ is the set
in which $|j| \ge \varepsilon$. But, for each $u$ in $[a, b]$, the set of bounded Baire functions
$f$ includes the function equal to 0 in $[a, u)$, equal to 1 at $u$, and equal to 2 in
$(u, b]$. Hence each point of $\bar E_\varepsilon$ must be isolated, and $E_\varepsilon$ is finite. This is true
for each $\varepsilon > 0$. Hence taking $\varepsilon^{-1} = 1, 2, \dots$, we obtain

(12) $j \ne 0$ only at a countable set of points $\{w_n\}$;

(13) $j(w_n) \to 0$ as $n \to \infty$.

Also, as $E_\varepsilon$ is finite,

(14) $j$ is bounded.


We now wish to find a strictly increasing function $\chi$ and a function $\delta > 0$
defined for all $u$ in $a \le u \le b$, such that for $u - \delta < w < u < v < u + \delta$,
$a \le w < v \le b$,
\[ [\chi]_u^v > |j(v)|, \tag{15} \]
\[ [\chi]_w^u > |j(w)|. \tag{16} \]

There is in Ward's sense a major function $P(f, j; a, u) + \chi_2(u)$ of $f$ with
respect to $j$ in $[a, b]$, where $\chi_2$ is monotone increasing and bounded in $[a, b]$,
with $\chi_2(a) = 0$. Thus, if we substitute in the value of $P(f, j; a, u)$, we find
that for $a \le u \le b$ and for some $\delta_3 = \delta_3(u) > 0$, using Ward's definition of a
major function,
\[ [\chi_2]_u^v \ge j(v)[f]_v^u \qquad (u < v < u + \delta_3,\ a \le v \le b), \tag{17} \]
\[ [\chi_2]_w^u \ge j(w)[f]_u^w \qquad (u > w > u - \delta_3,\ a \le w \le b). \tag{18} \]
We now take $f = -\operatorname{sgn} j$, where $\operatorname{sgn} \alpha = |\alpha|/\alpha$ $(\alpha \ne 0)$, $\operatorname{sgn} 0 = 0$. Then
if $\chi_3$, $\delta_4$ are the corresponding $\chi_2$, $\delta_3$, and if the $u$ of (17) does not lie in $\{w_n\}$,
so that $j(u) = 0$, $f(u) = 0$, we obtain, for $u < v < u + \delta_4$, $a \le v \le b$,
\[ [\chi_3]_u^v \ge |j(v)|. \tag{19} \]
Similarly let $\chi_4$, $\delta_5$ be the corresponding $\chi_2$, $\delta_3$ when for $f$ we take $\operatorname{sgn} j$,
and let the $u$ of (18) lie outside the sequence $\{w_n\}$ so that $j(u) = 0$, $f(u) = 0$.
Then
\[ [\chi_4]_w^u \ge |j(w)|, \qquad u > w > u - \delta_5,\ a \le w \le b. \tag{20} \]


By (13), 7(w,+) = 0. Thus if we put 
Xs(u) = > 2" (u ¢ {wea}) = xs(wp—) +2” (u=w,,p =1,2,...) 
we obtain 
x5s(Wp+) — xs(w,) = 2” > O = |j(w,+)), 
xs(Wp) — xs(wp—-) = 2.7 > 0 = |j(w,-— 


and there is a number $\delta_p = \delta(w_p)$ such that $\chi_5(u)$ satisfies (15) and (16) at
$u = w_p$, with $\chi$ replaced by $\chi_5$ and $\delta$ by $\delta_p$.
Using (19), (20) also, we see that to obtain (15), (16) for all $u$ in $a \le u \le b$
and a strictly increasing function $\chi$, we need only take
\[ \chi(u) = \chi_3(u) + \chi_4(u) + \chi_5(u) + u - a. \]
We now define the points $d_n$ in $(a, b)$ as those for which
\[ |j(d_n)| > \chi(d_n+) - \chi(d_n), \qquad |j(d_n)| > \chi(d_n) - \chi(d_n-). \tag{21} \]
















The other points $\{u_n\}$ of $\{w_n\}$ then give
\[ \sum_{n=1}^{\infty} |j(u_n)| \le \sum_{n=1}^{\infty} \{\chi(u_n+) - \chi(u_n-)\} \le [\chi]_a^b < \infty, \]
so that (9) is satisfied.

If $u < d_n < u + \delta(u)$ for some $u$, $d_n$, we have (15) with $v = d_n$. Let $\underline{d}_n$
be the upper bound of all $u < d_n$ satisfying (15) for fixed $v = d_n$. If there is
no such $u$, put $\underline{d}_n = a$. Then
\[ \chi(d_n) - \chi(\underline{d}_n-) \ge |j(d_n)|, \tag{22} \]
while if $d_n > u > \underline{d}_n$, we have
\[ \chi(d_n) - \chi(u) \le |j(d_n)|. \tag{23} \]
By (14), $j$ is bounded, so that we can take a convenient finite value for
$\chi(a-)$ to fit the cases when $\underline{d}_n = a$. From (21), (22), $\underline{d}_n < d_n$.
Similarly we can define $\overline{d}_n > d_n$ such that
\[ \chi(\overline{d}_n+) - \chi(d_n) \ge |j(d_n)|, \tag{24} \]
while if $d_n < u < \overline{d}_n$, we have
\[ \chi(u) - \chi(d_n) \le |j(d_n)|. \tag{25} \]

Results (22), (24) prove (11). We now suppose that (10) is false, so that
a point $u$ of $[a, b]$ lies in an infinity of the open intervals
\[ I(d_n) = (\underline{d}_n, \overline{d}_n) \subset (a, b). \]
Obviously $u \ne a$, $u \ne b$. Also by (23), (25), (13),
\[ \chi(\overline{d}_n-) - \chi(\underline{d}_n+) \le 2|j(d_n)| \to 0 \]
as $n \to \infty$. Hence as $\chi$ is strictly increasing, $\underline{d}_n \to u$ and $\overline{d}_n \to u$, for the
subsequence of $n$ for which $\underline{d}_n < u < \overline{d}_n$. Hence the corresponding subsequence
of $\{d_n\}$ also tends to $u$, so that for certain $v \to u$,
\[ |\chi(v) - \chi(u)| \le |j(v)|. \]
This result contradicts (15) or (16). Hence (10) is true, and the first part of
Theorem 3 has been proved.


We now prove the converse. Let the discontinuities of $\chi$ in $[a, b]$ occur at
the points $v_n$ $(n = 1, 2, \dots)$. Then we have
\[ \sum_{n=1}^{\infty} \{\chi(v_n+) - \chi(v_n-)\} \le [\chi]_a^b < \infty \]
so that, given $\varepsilon > 0$, there is an integer $n_0$ such that
\[ \sum_{n \ge n_0} \{\chi(v_n+) - \chi(v_n-)\} < \varepsilon. \tag{26} \]
Then there is an integer $n_1$ such that, for $n \ge n_1$, $d_n$ is not one of the points
$v_q$ $(q = 1, \dots, n_0 - 1)$.


We now let $F$ in the Lemma be the family of intervals $I(d_n)$, and we take $p$
so large that
\[ m\chi\{[a, b] - H_p\} < \varepsilon. \tag{27} \]
This is possible since by (10),
\[ [a, b] = \bigcup_{p=0}^{\infty} H_p. \]
By the Lemma there are $3p$ sets $S_q$ of non-overlapping intervals $I(d_n)$ that
together cover $H_p - H_0$. There is an integer $t \ge n_1$, and depending on $\varepsilon$,
such that for each $q$ in $1 \le q \le 3p$,
\[ \sum \{\chi(\overline{d}_n+) - \chi(\underline{d}_n-)\} < \varepsilon/(3p), \tag{28} \]
where the sum is taken over those intervals of $S_q$ with $n \ge t$, as the sum for
$n > 0$ is not greater than $\chi(b) - \chi(a)$. The integer $t$ can also be chosen, by
(9), so that
\[ \sum_{n \ge t} |j(u_n)| < \varepsilon. \tag{29} \]
Let $S$ be the set formed from those intervals of the $S_q$ with $n \ge t$ and
$1 \le q \le 3p$. Then
\[ \{[a, b] - H_p\} \cup S \]
is a union of intervals. For if $u$ lies in $[a, b] - H_p$, let $J$ be the intersection of
the first $(p + 1)$ intervals $I(d_n)$ covering $u$. Then $J$ is open and contains $u$,
and
\[ J \subset [a, b] - H_p. \]
We add an at most countable number of points, if necessary, to obtain from
$\{[a, b] - H_p\} \cup S$ a union $U$ of open non-abutting intervals, and we put
\[ \chi_6(u) = {\sum}_1 \{\chi(\beta+) - \chi(\alpha-)\} + \varepsilon(u - a)/(b - a) + {\sum}_2\, 2|j(u_n)|, \tag{30} \]
where $\sum_1$ denotes the summation over the intervals $(\alpha, \beta)$ of $U \cap (a, u)$,
changing $\beta+$ to $\beta$ if $\beta = u$; and $\sum_2$ denotes the summation over all $n \ge t$
such that $u_n < u$, adding $|j(u_p)|$ if $p \ge t$ and $u = u_p$. Then $\chi_6$ is strictly
increasing, and from (26), (27), (28), (29),
\[ [\chi_6]_a^b \le 6\varepsilon. \tag{31} \]

Now, by definition, the points of $H_0$ are not covered by any interval $I(d_n)$.
If $n \ge t$ and if $I(d_n)$ covers a point of $H_p - H_0$, then $I(d_n)$ will lie in one of the
$S_q$, and so in $S$, and so in $U$. It follows that $\chi(d_n) - \chi(\underline{d}_n-)$ will occur in
$\sum_1$ for $u = d_n$. If $n \ge t$ and if $I(d_n)$ does not cover a point of $H_p - H_0$, then
$I(d_n)$ will lie entirely within $[a, b] - H_p$, and so in $U$, and again, $\chi(d_n) - \chi(\underline{d}_n-)$
will occur in $\sum_1$ for $u = d_n$. Thus by (30),
\[ \chi_6(d_n) - \chi_6(\underline{d}_n-) \ge \chi(d_n) - \chi(\underline{d}_n-) \ge |j(d_n)| \qquad (n \ge t). \tag{32} \]
Similarly for the result with $\overline{d}_n+$, so that $\chi_6$ satisfies (11) for all $n \ge t$.


Now each point $u$ of $[a, b]$ lies in an at most finite number of the $I(d_n)$, say
$I(\xi_1), \dots, I(\xi_s)$, where $\xi_1, \dots, \xi_s$ depend on $u$. Let the sequence $\{\eta_n\}$ include
all points of the sequences $\{u_n\}$, $\{d_n\}$, $\{\underline{d}_n\}$, $\{\overline{d}_n\}$, and let $u$ be outside $\{\eta_n\}$.
We take $\delta_6 = \delta_6(u) > 0$ so that $(u - \delta_6, u + \delta_6)$ does not include


i aso Ethie ee : E, 
Then by (32), for $u < d_n < \min(b, u + \delta_6)$,
\[ \chi_6(d_n) - \chi_6(u) \ge \chi(d_n) - \chi(\underline{d}_n-) \ge |j(d_n)|, \]
since $\underline{d}_n > u$. If $u_n$ lies in $u < u_n < \min(b, u + \delta_6)$ then $n \ge t$, and by (30),
\[ \chi_6(u_n) - \chi_6(u) \ge |j(u_n)|. \]
If $v$ is neither in $\{u_n\}$ nor in $\{d_n\}$ then for $u < v < \min(b, u + \delta_6)$,
\[ \chi_6(v) - \chi_6(u) > 0 = |j(v)|. \]
Hence, if $u$ is outside $\{\eta_n\}$,
\[ \chi_6(v) - \chi_6(u) \ge |j(v)|, \qquad u < v < \min(b, u + \delta_6). \tag{33} \]

Similarly for all $v$ in $u > v > \max(a, u - \delta_6)$. To deal with the case when
$u = \eta_n$ for some $n$, we put

X7(u) = xe(u) 4+ +2 (wu ¢ inn} 

tnt 
‘ =p 
X7(mp) = xz(me—) + €.2 (p = 1,2, 


As in the part of the proof that follows (20), we obtain a strictly increasing
function $\chi_7$ satisfying (33) for all $u$, and, for suitable $\delta_7 > 0$, for
\[ u < v < \min(b, u + \delta_7), \]
and similarly for $v < u$. By (31),
\[ [\chi_7]_a^b \le 7\varepsilon. \tag{34} \]
Now suppose that $|f| \le A$. We put
\[ M_3(u) = [fj + 2A\chi_7]_a^u. \]
Then from (33),
\[ [M_3]_u^v - f(u)[j]_u^v = [f]_u^v\, j(v) + 2A[\chi_7]_u^v
   \ge [f]_u^v\, j(v) + 2A|j(v)| \ge 0 \qquad (u < v < \min(b, u + \delta_7)). \]
The inequalities are reversed when $u > v > \max(a, u - \delta_7)$, so that $M_3$ is a
major function, in Ward's sense, for $f$ with respect to $j$ in $[a, b]$. Similarly
\[ M_4(u) = [fj - 2A\chi_7]_a^u \]
is a minor function, and by (34)
\[ M_3(b) - M_4(b) = 4A[\chi_7]_a^b \le 28A\varepsilon. \]
By choice of $\varepsilon > 0$ this can be made arbitrarily small. Hence there exists
\[ P(f, j; a, u) = [fj]_a^u, \]
proving the converse in Theorem 3.

THEOREM 4. If, for a given function $g$, and for all bounded Baire functions
$f$ in $[a, b]$, the integral $P(f, g; a, b)$ exists, then

(35) $g(u-)$ exists in $a < u \le b$, $g(u+)$ exists in $a \le u < b$, and both are of
bounded variation in those ranges; and the function $j$ satisfies Theorem 3 (9),
(10), (11), where
\[ j(a) = g(a) - g(a+), \qquad j(b) = g(b) - g(b-), \qquad
   j(u) = g(u) - \tfrac{1}{2}\{g(u+) + g(u-)\} \quad (a < u < b). \tag{36} \]

Conversely, if $g$ satisfies (35), and if the $j$ defined by (36) satisfies Theorem
3 (9), (10), (11), and if $f$ is a bounded Baire function in $[a, b]$, then $P(f, g; a, b)$
exists and is equal to
\[ \{g(b) - g(b-)\} f(b) + \{g(a+) - g(a)\} f(a) + \sum_{a < u < b} f(u)\{g(u+) - g(u-)\}
   + (LS) \int_a^b f(u)\, dg_c(u), \]
where
\[ g_c(v) = g(v-) - \sum_{a < u < v} \{g(u+) - g(u-)\} \quad (a < v \le b), \qquad g_c(a) = g(a+). \]

The result (35) is proved in (2), Theorem 2, using only the hypotheses of
the present Theorem 4. From (35) we see that $g - j$ is of bounded variation
in $[a, b]$, so that $P(f, g - j; a, b)$ exists. By hypothesis $P(f, g; a, b)$ exists.
Hence so does $P(f, j; a, b)$. Also, from (35),
\[ \lim_{w \to u-} g(w-) = g(u-), \qquad \lim_{w \to u-} g(w+) = g(u-), \]
so that from (36), $j(u-) = 0$. Similarly $j(u+) = 0$. If $E_\varepsilon$ is the set in
$a \le u \le b$ where $j \ge \varepsilon > 0$, and if $E_\varepsilon$ has a limit-point $\xi$, then
\[ \limsup_{w \to \xi} j(w) \ge \varepsilon. \]
This contradicts $j(\xi-) = 0 = j(\xi+)$, so that $E_\varepsilon$ has no limit-points and so
must contain only a finite number of points. Thus taking $\varepsilon = n^{-1}$ $(n = 1, 2, \dots)$,
the set where $j > 0$ is at most countable. Similarly the set where $j < 0$ is at
most countable. Hence by Theorem 1,
\[ P(f, j; a, u) = [fj]_a^u, \]
so that the first part of Theorem 3 completes the first part of Theorem 4.
To prove the converse in Theorem 4 we need only use the converse in
Theorem 3 and the fact that $g - j$ is of bounded variation in $[a, b]$, and
(4, pp. 208-209, Theorem 8.1).


4. The points of infinite variation of j. We now suppose that
\[ j(u-) = 0 \quad (a < u \le b), \qquad j(u+) = 0 \quad (a \le u < b). \tag{37} \]

Let $T_1$ be the union of the interiors of all closed intervals $J$ contained in
$[a, b]$, such that $P(f, j; J)$ exists for all bounded Baire functions $f$, adding
one or both of $a$, $b$ to $T_1$ according as one or both of $[a, a + \varepsilon]$, $[b - \varepsilon, b]$
are intervals $J$ for some $\varepsilon > 0$. Also put $T = CT_1 \cap [a, b]$. Let $W$ be the set
of points of infinite variation of $j$.


THEOREM 5. If $J$ is a closed interval, there is a function $j$ satisfying (37),
such that
\[ J = W, \qquad J = T. \tag{38} \]
If $Q$ is a closed nowhere dense set, there is a function $j$ satisfying (37), such
that
\[ T = W = Q, \tag{39} \]
and there is another function $j$ satisfying (37), such that
\[ T = \emptyset, \qquad W = Q, \tag{40} \]
where $\emptyset$ is the empty set.

We begin by supposing that

(41) the set of points $\{v_n\}$ in $[a, b]$ can be put into one-one correspondence with
the points $(2q + 1)2^{-p}$ $(0 \le q < 2^{p-1};\ p = 1, 2, \dots)$, the order of the points being
preserved.

Then we define $j(v_n) = p^{-1}$ when $v_n$ corresponds to $(2q + 1)2^{-p}$, and $j(u) = 0$
when $u$ is outside $\{v_n\}$. Such a $j$ satisfies (37), as only a finite number of $j(v_n)$
are greater than any given positive $\varepsilon$. If a $\chi$ exists satisfying Theorem 3 (10),
(11), we can suppose that
\[ [\chi]_a^b = B, \qquad [\chi]_u^v \ge v - u, \tag{42} \]
for all $a \le u < v \le b$. Then the set of intervals $I(d_n)$ for which
\[ \chi(\overline{d}_n+) - \chi(\underline{d}_n-) \ge 2/p \]
must be such that any non-overlapping and non-abutting subset has at most
$\tfrac{1}{2}pB$ members. Hence any non-overlapping subset has at most $pB$ members.
The points of $\{v_n\}$ that are not in $\{d_n\}$ are points $\{u_n\}$ satisfying Theorem
3 (9). It follows that for some integer $r$, there is a point $d_{01}$ in $\{d_n\}$ with
\[ \chi(\overline{d}_{01}+) - \chi(\underline{d}_{01}-) \ge 2/r \]
such that $I(d_{01})$ contains at least two different points $\xi_1$, $\xi_2$ of $\{v_n\}$ corresponding
to points $(2q + 1)2^{-r}$ with the given $r$. Hence


, has / . 
—; = I (do ) | 1) } \f1, 







is not empty, as there are points of $\{v_n\}$ between each two points of $\{v_n\}$ by
(41). Since $\xi_1$, $\xi_2$ lie at a positive distance from the ends of $I(d_{01})$, and since
\[ \overline{d}_n - \underline{d}_n \le \chi(\overline{d}_n+) - \chi(\underline{d}_n-) \to 0 \]
as $n \to \infty$, by (42), (10), and the bounded variation of $\chi$, there is an $n_2$ such
that if $n > n_2$ and $d_n \in Q_1$, then
\[ I(d_n) \subset I(d_{01}). \]
We can now repeat the construction, defining $d_{02}, d_{03}, \dots$, and
\[ I(d_{01}) \supset I(d_{02}) \supset \dots \supset I(d_{0n}) \supset \dots \]
As $\{d_{0n}\}$ is a subsequence of $\{d_n\}$ we have $\overline{d}_{0n} - \underline{d}_{0n} \to 0$ as $n \to \infty$, and hence
for a point $u$ in $(a, b)$, $I(d_{0n}) \to u$. This $u$ lies in an infinity of the intervals
$I(d_n)$, contrary to (10). Hence in this case there is no $\chi$ satisfying Theorem
3 (10), (11), so that for some bounded Baire function $f$, $P(f, j; a, b)$ cannot
exist.
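
A short count, offered as a plausibility check rather than as part of the proof, shows why the points $\{v_n\}$ cannot all be placed in a sequence $\{u_n\}$ satisfying Theorem 3 (9): at level $p$ there are $2^{p-1}$ points $(2q+1)2^{-p}$, each carrying the value $p^{-1}$.

```latex
\[
\sum_{n} |j(v_n)| = \sum_{p=1}^{\infty} \frac{2^{p-1}}{p} = \infty ,
\qquad\text{while}\qquad
\#\{\, n : j(v_n) > \varepsilon \,\} = \sum_{p < 1/\varepsilon} 2^{p-1} < \infty .
\]
```

The second relation is the finiteness used to verify (37); the first shows that infinitely many of the $v_n$ must fall in the sequence $\{d_n\}$.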

A similar result is true for each interval $J$ containing points of $\{v_n\}$ in its
interior, by (41). Hence
\[ T \supset \{v_n\}' \tag{43} \]
since by (41) each point of $\{v_n\}'$ is the limit-point of a sequence of intervals
of $T$.

To prove (38) let $J$ be the interval $[\alpha, \beta]$. Then the points
\[ v_n = \alpha + (\beta - \alpha)(2q + 1)2^{-p} \qquad (0 \le q < 2^{p-1};\ p = 1, 2, \dots) \]
will satisfy (41), and by (43),
\[ \{v_n\}' = J = T. \]

To prove (39) we take the points $v_n$ to be the centres of the intervals $I$
complementary to $Q$ in $[a, b]$. That $\{v_n\}$ so defined satisfies (41), can be shown
by (3, p. 57, Proposition 20). Then by (43),


and (39) is proved. 

To prove (40) let $d_{1n}$ be the centre of the $n$th interval $I_n = (\alpha_n, \beta_n)$
complementary to $Q$ in $[a, b]$. Next, let $d_{2n1}$ and $d_{2n2}$ be the centres of $(\alpha_n, d_{1n})$ and
$(d_{1n}, \beta_n)$, respectively, calling these two points the points of the second stage.
We continue this process of continued bisection to the stage $n^2$. If $d_{pnq}$ is a


point of the pth stage in J, put j(d,,,) = n~?2-%, with (dyay, pny) as the 
(p — 1)th stage interval with centre d,,,. If this is done for |< p <n 
(n = 1,2,...) with 7 = O elsewhere, and 1! 

x(dyng) — x(dpne) = 2-* 2°” 


we have 


x (G,,) x (a, ) n 2; 











and the construction of a strictly increasing $\chi$ satisfying the required
conditions is possible. Each point of $[a, b]$ lies in an at most finite number of the
$I(d_{pnq})$, as it lies in at most $n^2$ in the interval $I_n$. Finally, over all the points
Goes OH J, 


> I(dong) - 5 


Thus $T$ is empty and $W = Q$, proving (40).


THEOREM 6. Let $j$ satisfy (37), with $T$, $W$ as defined just before Theorem 5.
Then:
(44) $T$ is perfect;
(45) $W \supset T$;
(46) The interior of $W$ is contained in $T$;
(47) If $Q \subset R$ are two perfect sets in $[a, b]$ with the same interior, there is a $j$
such that $T = Q$, $W = R$;
(48) In order that $T$ should be empty, it is necessary but not sufficient that the
set of points $\{d_n\}$ of Theorem 3 should be scattered.³

COROLLARY 1. If $W$ is at most countable then $T$ is empty and $P(f, j; a, b)$
exists.

COROLLARY 2. No structural property of $W$ can be both necessary and sufficient
for $T$ to be empty.

By construction, $T$ is closed. Thus to prove (44) we have only to show that
$T$ has no isolated points. Suppose on the contrary that $v$ is an isolated point of
$T$. Then there are points $\alpha$, $\beta$, such that $\alpha < v < \beta$, with $[\alpha, v)$ and $(v, \beta]$
in $T_1$. Putting
\[ v_n = v - (v - \alpha)/(n + 1), \]
we see that
\[ P_n = P(f, j; v_n, v_{n+1}) \]
exists for each $n$ and each bounded Baire function $f$. By hypothesis $j = 0$
except at an at most countable set of points, so that by Theorem 1,
\[ P_n = f(v_{n+1})\, j(v_{n+1}) - f(v_n)\, j(v_n). \]
Hence for each $\varepsilon > 0$ there is an increasing function $\chi_8$ such that
\[ [fj]_\alpha^u + \chi_8(u), \qquad [fj]_\alpha^u - \chi_8(u) \]
are a major and a minor function, respectively, in $\alpha \le u < v$, in Ward's
sense, with
\[ \chi_8(v_{n+1}) - \chi_8(v_n) < \varepsilon 2^{-n}, \qquad \chi_8(u) - \chi_8(\alpha) < 2\varepsilon. \]
If we set $\chi_8(v) - \chi_8(v-) = \varepsilon$, then as $f$ is bounded, say by $A$, and $j(v-) = 0$,
we have
\[ [\chi_8]_u^v \ge \varepsilon > 2A|j(u)| \ge [f]_v^u\, j(u) \]


³"Zerstreute" (F. Hausdorff), "separierte" (G. Cantor), "clairsemé" (A. Denjoy).




for $v - \delta_8 < u < v$ and some $\delta_8 > 0$. Hence
\[ [fj + \chi_8]_u^v \ge f(v)[j]_u^v, \quad\text{and}\quad [fj]_\alpha^u + \chi_8(u) \]
is a major function in $[\alpha, v]$. Similarly
\[ [fj]_\alpha^u - \chi_8(u) \]
is a minor function in $[\alpha, v]$, and
\[ [\chi_8]_\alpha^v < 3\varepsilon. \]
Thus $P(\alpha, v)$ exists. Similarly $P(v, \beta)$ exists, so that by (5, pp. 585-586),
property J, $P(\alpha, \beta)$ exists, and $v$ does not lie in $T$, contrary to hypothesis.

If $j$ is of bounded variation in the closed interval $J$ then $P(f, j; J)$ exists.
Hence (45) is true. Further, if $W$ contains an interval $[\xi, \eta]$ let $J$ be a
subinterval. If $P(f, j; J)$ exists for each bounded Baire function $f$, then by Theorem
1, and then Theorem 3 (10), the set of points $\{d_n\}$ in $J$ has the Denjoy property
(see, e.g., (1), chap. III, p. 140). Hence it is scattered, and so is nowhere
dense in $J$. It follows that $W$ must be nowhere dense in $J$, as the points $\{u_n\}$
of Theorem 3 add nothing to $W$. This contradicts the fact that $J$ is contained
in $W$, so that $[\xi, \eta]$ is contained in $T$, and $T$ contains the interior of $W$, proving
(46).

To prove (47) we first take the closure $J_n$ of the $n$th interval of the interior
of $Q$, and construct a function $j_n$ satisfying (37), (38) with $J = J_n$. Then we
construct a function $j_0$ satisfying (37), (39), with the $Q$ there replaced by the
present $Q$ less its interior. Finally we construct a function $j_{-1}$ satisfying
(37), (40), with the $Q$ there replaced by the closure of $R - Q$. Then
\[ \sum_{n=-1}^{\infty} j_n \]
satisfies the conditions of (47).

For (48), if $T$ is empty then by Theorems 1 and 3 (10), the set of points
$\{d_n\}$ in $[a, b]$ has the Denjoy property, and so is scattered. But for the function
satisfying (37), (39), the set of points $\{d_n\}$ in $[a, b]$ is also scattered, so that
(48) follows.

Corollary 1 follows from (44), (45), and Corollary 2 from (47).


REFERENCES 


1. A. Denjoy, Leçons sur le calcul des coefficients d'une série trigonométrique (Paris, 1941, 1949).
2. R. Henstock, The efficiency of convergence factors for functions of a continuous real variable,
   J. London Math. Soc., 30 (1955), 273-286.
3. J. E. Littlewood, The Elements of the Theory of Real Functions (Cambridge, 1926).
4. S. Saks, Theory of the Integral (Warsaw, 1937).
5. A. J. Ward, The Perron-Stieltjes integral, Math. Z., 41 (1936), 578-604.


Queen's University,
Belfast











A GENERALIZATION OF THE 
CAUCHY PRINCIPAL VALUE 


CHARLES FOX 


1. Introduction. If $a < u < b$ and $n \ge 0$ then
\[ \int_a^b \frac{f(x)\, dx}{(x - u)^{n+1}} \tag{1} \]
is a so-called improper integral owing to the infinity in the integrand at
$x = u$. When $n = 0$ we have associated with (1) the well-known Cauchy
principal value, namely
\[ \lim_{\varepsilon \to +0} \Bigl\{ \int_a^{u-\varepsilon} \frac{f(x)\, dx}{x - u} + \int_{u+\varepsilon}^{b} \frac{f(x)\, dx}{x - u} \Bigr\}. \tag{2} \]


Hadamard (1, p. 117 et seq.) derives from an improper integral an expression 
which he calls its finite part and which, as he shows, possesses many important 
properties. For given f(x) this finite part is obtained by constructing a function 
g(x) so that the following limit exists (1, pp. 136 and 138): 


; ""Sf(x) — g(x)t 
(3) lim | ) m( ax 
rw-00_ | (x — & 
Hadamard confines himself to the case when $m = n + \tfrac{1}{2}$ and $n$ is a positive
integer. In this paper we shall use Hadamard's idea to define a principal value
for (1) in the case when $n$, in (1), is a positive integer greater than zero.
When $n = 0$ the definition will reduce to (2).

The principal value so defined enables us to generalize several well-known
theorems. We shall illustrate this generalization later by discussing the
Hilbert transform (4, p. 120 (5.1.11)) and the Plemelj formulae (2, p. 42
(17.2)).


2. The principal value of (1). For the rest of this paper $n$ will always
denote a positive integer or zero; $f^{(i)}(x)$ will denote the $i$th derivative of $f(x)$
with respect to $x$ and the principal value of (1) will be indicated by means of a
prefix $P$ before the integral sign.

For integration along the real axis with $a < u < b$ the principal value of
(1) is defined as follows:
\[ P\!\int_a^b \frac{f(x)\, dx}{(x - u)^{n+1}} = \lim_{\varepsilon \to +0} \Bigl\{ \int_a^{u-\varepsilon} \frac{f(x)\, dx}{(x - u)^{n+1}} + \int_{u+\varepsilon}^{b} \frac{f(x)\, dx}{(x - u)^{n+1}} - H_n(u, \varepsilon) \Bigr\}, \tag{4} \]
where
Received March 14, 1956 




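\[ H_0(u, \varepsilon) = 0, \qquad n = 0, \tag{5} \]
\[ H_n(u, \varepsilon) = \sum_{i=0}^{n-1} \frac{f^{(i)}(u)}{i!} \cdot \frac{1 - (-1)^{n-i}}{(n - i)\,\varepsilon^{n-i}}, \qquad n > 0. \tag{6} \]

When $n = 0$ this principal value evidently reduces to (2).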
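
To make the definition concrete, here is the case $n = 1$ with the illustrative choice $f(x) \equiv 1$ (an assumed example, not one from the text). For $n = 1$ the sum in (6) has the single term $i = 0$, so $H_1(u, \varepsilon) = 2f(u)/\varepsilon$, and the truncated integrals can be evaluated in closed form:

```latex
\begin{align*}
\int_a^{u-\varepsilon} \frac{dx}{(x-u)^2} + \int_{u+\varepsilon}^{b} \frac{dx}{(x-u)^2}
 &= \Bigl[-\frac{1}{x-u}\Bigr]_a^{u-\varepsilon} + \Bigl[-\frac{1}{x-u}\Bigr]_{u+\varepsilon}^{b}
  = \frac{2}{\varepsilon} + \frac{1}{a-u} - \frac{1}{b-u}, \\
P\!\int_a^b \frac{dx}{(x-u)^2}
 &= \lim_{\varepsilon \to +0}\Bigl\{\frac{2}{\varepsilon} + \frac{1}{a-u} - \frac{1}{b-u} - \frac{2}{\varepsilon}\Bigr\}
  = \frac{1}{a-u} - \frac{1}{b-u}.
\end{align*}
```

The divergent term $2/\varepsilon$ is exactly what $H_1$ removes. The resulting value is negative although the integrand is positive, which is typical of a finite-part definition.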


THEOREM 1. If (i) $f(x)$ possesses derivatives up to order $n$ in $(a, b)$ and (ii)
$f^{(n)}(x)$ satisfies a Lipschitz (or Hölder) condition, namely
\[ |f^{(n)}(x_1) - f^{(n)}(x_2)| \le A|x_1 - x_2|^{\mu} \tag{7} \]
whenever $x_1$ and $x_2$ both lie in $(a, b)$, $A$ is a constant and $0 < \mu \le 1$, then the
limit on the right hand side of (4) exists.


Proof. Consider the expression $E$ given by
\[ E = \Bigl\{ f(x) - f(u) - \frac{(x - u)}{1!} f'(u) - \dots - \frac{(x - u)^{n-1}}{(n - 1)!} f^{(n-1)}(u) \Bigr\} \frac{1}{(x - u)^{n+1}}. \tag{8} \]
From (i) and the mean value theorem, we see that
\[ E = \frac{f^{(n)}(t)}{n!\,(x - u)}, \tag{9} \]
where $t$ lies between $x$ and $u$. On writing
\[ E = \frac{f^{(n)}(t) - f^{(n)}(u)}{n!\,(x - u)} + \frac{f^{(n)}(u)}{n!\,(x - u)} = E_1 + E_2 \tag{10} \]
we see from (ii) that, since $t$ lies between $x$ and $u$,
\[ |E_1| \le A|x - u|^{\mu - 1}, \qquad -1 < \mu - 1 \le 0. \tag{11} \]
Consequently it follows from the usual theory of the Cauchy principal value
(2, chap. 2) that
\[ \lim_{\varepsilon \to +0} \Bigl\{ \int_a^{u-\varepsilon} + \int_{u+\varepsilon}^{b} \Bigr\} E\, dx \]
exists.

Denote the part of (4) inside the brackets $\{\ \}$ by $R$. Then, apart from
functions of $u$, we see that
\[ R - \Bigl\{ \int_a^{u-\varepsilon} + \int_{u+\varepsilon}^{b} \Bigr\} E\, dx \tag{12} \]
contains only negative powers of $x - u$ in its integrand. On performing the
integrations we find that (12) is independent of $\varepsilon$ and it therefore follows that
\[ \lim_{\varepsilon \to +0} R \]
must also exist. This completes the proof of Theorem 1.
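
The claim that (12) is independent of $\varepsilon$ can be checked term by term. The following computation is a sketch for real integration, using only the Taylor terms removed in (8); it indicates that the $\varepsilon$-dependent part of the two truncated integrals is precisely the sum subtracted in (4).

```latex
% For 0 \le i \le n-1 the Taylor term f^{(i)}(u)(x-u)^i / i! contributes
\begin{align*}
\int_a^{u-\varepsilon} (x-u)^{i-n-1}\,dx + \int_{u+\varepsilon}^{b} (x-u)^{i-n-1}\,dx
 &= \frac{(a-u)^{i-n} - (b-u)^{i-n}}{n-i}
  + \frac{1 - (-1)^{n-i}}{(n-i)\,\varepsilon^{\,n-i}} .
\end{align*}
% Multiplying by f^{(i)}(u)/i! and summing over i gives H_n(u,\varepsilon)
% plus terms that do not involve \varepsilon.
```

Subtracting $H_n(u, \varepsilon)$ therefore leaves only the end-point terms in $a$ and $b$, which do not depend on $\varepsilon$.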













The definition (4) is easily extended to the case when the integration is
taken along the arc of a plane curve. The variables are then all complex,
$a$ and $b$ correspond to the end points of the arc of integration, $x$ is any point
on this arc and $u$ is a fixed point on it. Draw a circle centre $u$ and radius
$\varepsilon$ $(> 0)$ so small that the arc of integration is cut in two points only, $u - \varepsilon_1$
between $a$ and $u$ and $u + \varepsilon_2$ between $u$ and $b$. The principal value is then
obtained by making the following changes in the right hand side of (4):
(i) in the first integral replace $u - \varepsilon$ by $u - \varepsilon_1$, (ii) in the second integral
replace $u + \varepsilon$ by $u + \varepsilon_2$ and (iii) in $H_n(u, \varepsilon)$ replace
\[ \frac{1 - (-1)^{n-i}}{(n - i)\,\varepsilon^{n-i}} \quad\text{by}\quad \frac{1}{(n - i)\,\varepsilon_2^{\,n-i}} - \frac{(-1)^{n-i}}{(n - i)\,\varepsilon_1^{\,n-i}}. \tag{13} \]
Theorem 1 holds for this definition also if the interval $(a, b)$ is replaced by the
arc of integration from $a$ to $b$.
3. Some properties of the principal value (4). 


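THEOREM 2. With the same conditions as in Theorem 1 and $0 < m \le n$,
$n \ge 1$, we have
\[ P\!\int_a^b \frac{f(x)\, dx}{(x - u)^{n+1}} = \sum_{i=0}^{m-1} \frac{(n - i - 1)!}{n!} \Bigl\{ \frac{f^{(i)}(a)}{(a - u)^{n-i}} - \frac{f^{(i)}(b)}{(b - u)^{n-i}} \Bigr\}
 + \frac{(n - m)!}{n!}\, P\!\int_a^b \frac{f^{(m)}(x)\, dx}{(x - u)^{n-m+1}}. \tag{14} \]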


Proof. Denote the part of (4) inside the brackets {| } by R. Denote the result 
of replacing f(u) by f'(u) and m by m — 1 in the right hand side of (6) by 
K,,-1(u, €). On integrating the two integrals in R by parts we have 


- f(a) f(b) f(u — e) f(u ) 
(15) R= [- - “on a + 7 — H,,(u, €) 
n(a — u) n(b — u) n({(—e) n(e) 
a 1) ("‘f' (x) dx f'(x) dx ; { 
— K,, 1(%, €) + ) Tr | _ = K,, 1 (uM, €) ( ° 
n n\. (x — 2) e « (X — 2) 


Denote the sum of the 3rd, 4th, 5th, and 6th terms on the right by S. From
condition (i) we may expand $f(u - \epsilon)$ and $f(u + \epsilon)$, by the mean value theorem,
in powers of $\epsilon$ as far as the nth derivative of f(u). It is then easily found that
all the coefficients of the various powers of $\epsilon$ vanish, so that S is equal to the
remainder terms only. Consequently

$$S = \frac{-f^{(n)}(t_1) + f^{(n)}(t_2)}{n(n!)}, \tag{16}$$

where $t_1$ lies between $u - \epsilon$ and u, while $t_2$ lies between u and $u + \epsilon$. From
the Hölder condition it follows immediately that $S \to 0$ as $\epsilon \to +0$. On
letting $\epsilon \to +0$ in (15) we see from (4) and the definition of $K_{n-1}(u, \epsilon)$ that


$$P\int_a^b \frac{f(x)\,dx}{(x-u)^{n+1}} = \frac{f(a)}{n(a-u)^n} - \frac{f(b)}{n(b-u)^n} + \frac{1}{n}\,P\int_a^b \frac{f'(x)\,dx}{(x-u)^n}. \tag{17}$$

This is evidently (14) with m = 1. On applying (17) to the principal value
on the right hand side of (17) we establish (14) for the case m = 2. By continuing
this process m times we establish (14) for every integral value of
$m \le n$.
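As a numerical illustration of (17) in the simplest case n = 1, the following Python sketch compares the two sides for $f(x) = e^x$. It rests on an assumption that is not stated in the passage above: that for n = 1 the principal value defined by (4) agrees with the usual Hadamard finite part, which can be evaluated by subtracting the first two Taylor terms of f at u. The helper names `midpoint`, `finite_part_order2`, `cauchy_pv` and `rhs_17` are illustrative only.

```python
import math


def midpoint(g, a, b, steps=20000):
    """Composite midpoint rule for a smooth integrand g on (a, b)."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def finite_part_order2(f, df, a, b, u, steps=20000):
    """Finite part of the integral of f(x)/(x-u)^2 over (a, b), a < u < b.

    Assumption: the n = 1 case of (4) is the Hadamard finite part, so the
    singular pieces f(u)/(x-u)^2 and f'(u)/(x-u) are integrated in closed form.
    """
    fu, dfu = f(u), df(u)

    def smooth(x):
        s = x - u
        return (f(x) - fu - dfu * s) / (s * s)

    return (midpoint(smooth, a, b, steps)
            + fu * (1.0 / (a - u) - 1.0 / (b - u))
            + dfu * math.log((b - u) / (u - a)))


def cauchy_pv(g, a, b, u, steps=20000):
    """Cauchy principal value of the integral of g(x)/(x-u) over (a, b)."""
    gu = g(u)
    return (midpoint(lambda x: (g(x) - gu) / (x - u), a, b, steps)
            + gu * math.log((b - u) / (u - a)))


def rhs_17(f, df, a, b, u):
    """Right hand side of (17) with n = 1."""
    return f(a) / (a - u) - f(b) / (b - u) + cauchy_pv(df, a, b, u)


if __name__ == "__main__":
    a, b, u = -1.0, 2.0, 0.3
    print(finite_part_order2(math.exp, math.exp, a, b, u))
    print(rhs_17(math.exp, math.exp, a, b, u))
```

The midpoint nodes are chosen so that none of them coincides with u for the sample values; for other choices of a, b and u the number of steps may need adjusting.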
Theorem 2 is also true for complex integration: 


THEOREM 2A. If conditions (i) and (ii) of Theorem 1 hold with $a = -\infty$
and $b = +\infty$, (iii) for large |x|, $f^{(m)}(x) = O(x^{n-m-p})$ ($p > 0$, $m \le n$), where O
is the Landau order symbol and (iv) $f^{(i)}(x)/x^{n-i} \to 0$ ($i = 0, 1, \ldots, m - 1$)
when $x \to \infty$ or $x \to -\infty$, then

$$P\int_{-\infty}^{\infty} \frac{f(x)\,dx}{(x-u)^{n+1}} = \frac{(n-m)!}{n!}\,P\int_{-\infty}^{\infty} \frac{f^{(m)}(x)\,dx}{(x-u)^{n-m+1}} \qquad (m \le n). \tag{18}$$

Proof. From (iii) the integrals in (14) converge when $a = -\infty$ and
$b = +\infty$ and so, from (i) and (ii), (14) is true with $-\infty$ and $\infty$ as the limits
of integration. From (iv) all the terms in the summation sign of (14) vanish
for such infinite limits and so (14) reduces to (18).


THEOREM 2B. If f(z) is one valued and analytic in a domain which includes
the simple closed Jordan curve C and its interior then

$$P\int_C \frac{f(z)\,dz}{(z-u)^{n+1}} = \frac{(n-m)!}{n!}\,P\int_C \frac{f^{(m)}(z)\,dz}{(z-u)^{n-m+1}} \qquad (m \le n), \tag{19}$$

where the integrals are taken once round C and u is a fixed point on C.


Proof. Since f(z) is analytic it possesses derivatives of all orders, each 
derivative satisfying a Lipschitz condition. Hence (14), with integration 
along the contour C, is true when f(z) is analytic. Since C is closed the end 
points coincide, i.e. b = a. Hence since f(z) and its derivatives are one valued 
it follows that the terms in the summation sign of (14) cancel in pairs, leaving 
us with (19). 


4. An extension of the Hilbert transform. 


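THEOREM 3. If the conditions of Theorem 2A hold, and (v) for large |x|,
$f^{(n)}(x) = O(x^{-1/2-p})$ for $p > 0$, and

$$g(u) = \frac{1}{\pi}\,P\int_{-\infty}^{\infty} \frac{f(x)\,dx}{(x-u)^{n+1}}, \tag{20}$$

where u is real, then

$$g(x) \in L^2(-\infty, \infty); \tag{20.1}$$

$$f^{(n)}(u) = -\frac{n!}{\pi}\,P\int_{-\infty}^{\infty} \frac{g(x)\,dx}{x-u}; \tag{21}$$

$$\int_{-\infty}^{\infty} \{g(x)\}^2\,dx = \frac{1}{(n!)^2}\int_{-\infty}^{\infty} \{f^{(n)}(x)\}^2\,dx. \tag{22}$$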













Proof. Since Theorem 2A holds we may use (18). On using (18) with
m = n it follows that (20) can be written in the form

$$g(u) = \frac{1}{\pi(n!)}\,P\int_{-\infty}^{\infty} \frac{f^{(n)}(x)\,dx}{x-u}. \tag{23}$$

On replacing x by u − t in the range $-\infty < x < u - \epsilon$ and x by u + t
in the range $u + \epsilon < x < \infty$, (23) becomes

$$g(u) = \lim_{\epsilon\to+0} \frac{1}{\pi(n!)}\int_{\epsilon}^{\infty} \frac{f^{(n)}(u+t) - f^{(n)}(u-t)}{t}\,dt. \tag{24}$$

From (v) $f^{(n)}(x) \in L^2(-\infty, \infty)$ and so we may apply Hilbert's transform
theorem (4, Theorem 91, p. 122) to (24). The truth of (20.1), (21) and (22),
including the existence of the principal value on the right of (21), then follows
immediately.

Evidently we may look upon (20) as an integral equation with g(x) as a
known and f(x) as an unknown function. The solution is then given by (21).
Theorem 3 can also be established under a different set of conditions if we use
M. Riesz's version of the Hilbert transform (4, p. 132).


5. An extension of the Plemelj formulae. Let A denote an arc in the
complex z plane generated by points z = x + iy where x = x(t) and y = y(t)
are continuous single valued functions of the real variable t. We shall assume
that there is a unique tangent at each point of the arc and we shall denote
the end points by $E_1$ and $E_2$. On describing A from $E_1$ to $E_2$ we can divide the
neighbourhood of each point u on A (other than $E_1$ and $E_2$) into two areas,
a left hand area and a right hand area with a small segment of A, containing
u, as a boundary between the two areas. Certain functions h(v) possess the
following property. Let u be a point on the arc A, other than one of the end
points, then as $v \to u$ the function h(v) tends to a unique limit provided that
the path to u lies entirely either in the left hand area or in the right hand
area. If this is the case then the limit from the left hand area will be denoted
by $h^+(u)$ and that from the right hand area by $h^-(u)$. We exclude paths which
approach u ultimately along the tangent to A.

When A becomes a closed contour C then C is described so that its exterior,
containing the point at infinity, is the right hand area.

Let F(v) be defined by the equation

$$F(v) = \frac{1}{2\pi i}\int_A \frac{f(z)\,dz}{(z-v)^{n+1}}, \tag{25}$$

where the integration is taken along the arc A.


THEOREM 4. If (i) for all points of A, except possibly the end points, f(z)
possesses derivatives up to order n, (ii) $f^{(n)}(z)$ satisfies a Hölder condition on A
(see inequality (7)) and u is a point on A other than one of the end points, then
$F^+(u)$ and $F^-(u)$ both exist and

$$F^+(u) = \frac{1}{2(n!)}\,f^{(n)}(u) + \frac{1}{2\pi i}\,P\int_A \frac{f(z)\,dz}{(z-u)^{n+1}}, \tag{26}$$

$$F^-(u) = -\frac{1}{2(n!)}\,f^{(n)}(u) + \frac{1}{2\pi i}\,P\int_A \frac{f(z)\,dz}{(z-u)^{n+1}}. \tag{27}$$
Proof. With v not a point on A integrate (25) n times by parts. We obtain

$$F(v) = G(v) + \frac{1}{2\pi i\,(n!)}\int_A \frac{f^{(n)}(z)\,dz}{z-v}, \tag{28}$$

where

$$G(v) = \frac{1}{2\pi i}\sum_{i=0}^{n-1} \frac{(n-i-1)!}{n!}\left\{\frac{f^{(i)}(a)}{(a-v)^{n-i}} - \frac{f^{(i)}(b)}{(b-v)^{n-i}}\right\}. \tag{29}$$


The conditions assumed above ensure the truth of Theorems 1 and 2 for the
case of integration along the arc A. Hence when v = u, where u is a point on
A, the integral on the right of (25) has a principal value. Again on taking the
case when m = n in (14), multiplying by $1/(2\pi i)$ and subtracting from (28)
we obtain

$$F(v) - \frac{1}{2\pi i}\,P\int_A \frac{f(z)\,dz}{(z-u)^{n+1}} = G(v) - G(u) + \frac{1}{2\pi i\,(n!)}\left\{\int_A \frac{f^{(n)}(z)\,dz}{z-v} - P\int_A \frac{f^{(n)}(z)\,dz}{z-u}\right\}. \tag{30}$$
Now let

$$h(v) = \frac{1}{2\pi i}\int_A \frac{f^{(n)}(z)\,dz}{z-v}. \tag{31}$$

Then if u is a point on A, since $f^{(n)}(z)$ satisfies a Hölder condition, it is known
that $h^+(u)$ and $h^-(u)$ exist (2, §16, p. 37). Again as $v \to u$ we have
$\{G(v) - G(u)\} \to 0$. Hence on making $v \to u$ through the left hand area it follows
from (30) that $F^+(u)$ exists and that

$$F^+(u) - \frac{1}{2\pi i}\,P\int_A \frac{f(z)\,dz}{(z-u)^{n+1}} = \frac{1}{n!}\left\{h^+(u) - \frac{1}{2\pi i}\,P\int_A \frac{f^{(n)}(z)\,dz}{z-u}\right\}. \tag{32}$$

From the first of the Plemelj formulae given by Muskhelishvili (2, p. 42
(17.2)) we see that the right hand side of (32) is equal to $f^{(n)}(u)/2(n!)$ which
establishes the truth of (26).

Similarly, by using the second of the Plemelj formulae just cited we can
establish the fact that $F^-(u)$ exists and also the truth of (27).

When we place n = 0 in (26) and (27) they reduce to the Plemelj formulae.

The theorem still holds if A is a simple closed contour. It can be extended
to the case when the path of integration is the real axis, from $-\infty$ to $\infty$, if
suitable conditions are imposed upon the derivatives of f(x) in order to make
the integrals converge, for example conditions (iii) and (iv) of Theorem 2A.













We now see that F(v), as defined in (25) with v complex, possesses the following
properties: (i) if v is not on A it is an analytic function of v, (ii) for large
v it is $O(v^{-n-1})$ and, (iii) the arc A is a line of discontinuity. In fact from (26)
and (27), when u is on A we have

$$F^+(u) - F^-(u) = \frac{1}{n!}\,f^{(n)}(u). \tag{33}$$

Again, with u on A, F(u) is undefined, but if the conditions of Theorem 4
are satisfied we may define F(u) to be equal to the principal value of the integral
on the right of (25).

If a is one of the end points of A and f(z) has a zero of order r at z = a
then it is not difficult to see that for r > n F(a) exists and that $F^+(a) = F^-(a) = F(a)$.
If $r \le n$ then in general F(v) has a singularity at v = a which is the
sum of a logarithmic singularity and a pole.
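The jump relation $F^+(u) - F^-(u) = f^{(n)}(u)/n!$ can be checked numerically in a simple special case. The Python sketch below is only an illustration under stated assumptions: the arc A is taken to be the real segment from −1 to 1 (so that the left hand area is the upper half plane), $f(z) = z^2$, and the limits are approximated by evaluating F at the points $u \pm i\delta$ with a small $\delta$. The function names `cauchy_type` and `jump` are hypothetical.

```python
import math


def cauchy_type(f, v, n, a=-1.0, b=1.0, steps=100000):
    """F(v) = (1/(2 pi i)) * integral over [a, b] of f(x)/(x - v)^(n+1) dx.

    Midpoint rule; v must lie off the segment, and the step must be small
    compared with the distance of v from the segment.
    """
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += f(x) / (x - v) ** (n + 1)
    return total * h / (2j * math.pi)


def jump(f, u, n, delta=1e-3):
    """Approximate F+(u) - F-(u) by F(u + i*delta) - F(u - i*delta)."""
    return cauchy_type(f, complex(u, delta), n) - cauchy_type(f, complex(u, -delta), n)


if __name__ == "__main__":
    f = lambda x: x * x
    print(jump(f, 0.3, 1))  # to be compared with f'(0.3)/1! = 0.6
    print(jump(f, 0.3, 0))  # to be compared with f(0.3) = 0.09
```

The agreement is only to order $\delta$, since the limits $v \to u$ are replaced by a small finite offset.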


6. Two applications of (33). When n = 0, (33) reduces to a result
which can be derived from the Cauchy principal value, a result which can be
used to solve many important boundary problems in various branches of
mathematical physics (2, chaps. 12 and 13; 3, pt. V). We now discuss briefly
two such problems where we can use (33) in the more general case when
n > 0 (n an integer).


Problem 1. Find a function F(v) which (i) is analytic at all points v except
for points on the arc A, (ii) for large v is $O(v^{-n-1})$ and (iii) for given g(u), where
u is on the arc A but is not one of its end points, we have

$$F^+(u) - F^-(u) = g(u). \tag{34}$$

To obtain a formal solution we first solve the differential equation

$$f^{(n)}(u) = (n!)\,g(u) \tag{35}$$

for f(u) and then by an obvious substitution we can express (34) in the form
(33). We then obtain as a formal solution of our problem

$$F(v) = \frac{1}{2\pi i}\int_A \frac{f(z)\,dz}{(z-v)^{n+1}}. \tag{36}$$

If g(z) is an analytic function of z in a domain D which includes the arc A
then there exists a solution of (35), f(z) say, which is also analytic in D.
Since f(z) then satisfies the conditions of Theorem 4 it will follow that F(v)
is a solution of the problem. With a more prolonged discussion it is possible
to show that a solution exists if g(z) satisfies a Hölder condition along the
arc A.

Problem 2. This is connected with the singular integral equation. For the
case n = 0 the singular integral equation below has been studied in great
detail by Muskhelishvili (2, chap. 6). For the general case $n \ge 1$ many new
difficulties occur but in one case a solution can be obtained by means of a
reduction to Problem 1. The equation in question is

$$g(u)\,f^{(n)}(u) + \frac{h(u)}{\pi i}\,P\int_A \frac{f(z)\,dz}{(z-u)^{n+1}} = k(u), \tag{37}$$

where g(u), h(u) and k(u) are given along the arc A and f(z) is to be determined.

A formal solution is obtained by assuming that a function F(v) exists which
is related to f(z) as in (25) and for which (26) and (27) both hold. On adding
and subtracting (26) and (27) we obtain both $f^{(n)}(u)$ and the integral in (37)
in terms of $F^+(u)$ and $F^-(u)$. After an obvious division (37) is then transformed
to

$$F^+(u) = \left\{\frac{g(u)(n!) - h(u)}{g(u)(n!) + h(u)}\right\} F^-(u) + \frac{k(u)}{g(u)(n!) + h(u)}. \tag{38}$$


This equation can be reduced to the same type of equation as is solved in
Problem 1, namely (34) with functions transformed from $F^+(u)$, $F^-(u)$,
g(u), h(u) and k(u) of (38) by means of known operations. F(v) can then be
found and then, by using (33), f(z) can be found by integration.

The most important part of this solution is the reduction of (38) to an
equation of type (34), an equation which is solvable by means of the methods
of Problem 1. This reduction does not depend upon the value of n and is
therefore the same for the general value of n as for the case when n = 0.
The details and the ingenious methods used by Muskhelishvili to effect this
reduction when n = 0 can be found in (2, §47, p. 123). If the coefficient of
$F^-(u)$ in (38) and the second term on the right hand side of (38) both satisfy
Hölder conditions then the solution obtained by this method is valid.


REFERENCES 


1. J. Hadamard, Lectures on Cauchy's Problem in Linear Partial Differential Equations (New
York, 1952).

2. N. I. Muskhelishvili, Singular Integral Equations, translated from Russian into English
by J. R. M. Radok (Groningen, 1953).

3. N. I. Muskhelishvili, Some Basic Problems of the Mathematical Theory of Elasticity, translated from
Russian into English by J. R. M. Radok (Groningen, 1953).

4. E. C. Titchmarsh, An Introduction to the Theory of Fourier Integrals (Oxford, 1937).


McGill University 











ON THE ZEROS OF THE FRESNEL INTEGRALS 
ERWIN KREYSZIG 


1. Introduction. This paper is concerned with the Fresnel integrals

$$C(u) = \int_0^u \cos(p^2)\,dp, \qquad S(u) = \int_0^u \sin(p^2)\,dp \tag{1.1}$$

in the complex domain.

Recent research work in different fields of physical and technical applications
of mathematics shows that an increasing number of problems require a
detailed knowledge of elementary and higher functions for complex values
of the argument. The Fresnel integrals, introduced by A. J. Fresnel (1788-1827)
in connection with diffraction problems, are among these functions;
a small collection of papers of the above-mentioned kind is included in the
bibliography at the end of this paper (3; 5; 7; 12; 13; 17; 19; 20; 22). Moreover,
the Fresnel integrals are important since various types of more complicated
integrals can be reduced (6) to analytic expressions involving C(u) and
S(u).

At the present time the detailed investigation of special functions for com- 
plex argument is still in its infancy. It has been limited, until now, to certain 
classes of functions, especially to those which have the advantage of possessing 
simple functional relations or of satisfying an ordinary differential equation; 
in the latter case the theory of differential equations can be used when con- 
sidering these functions. The Gamma function and the Bessel functions, 
respectively, are of this kind. The Fresnel integrals do not possess these 
advantageous properties and must therefore be treated by other methods.

The Fresnel integrals have been considered from different points of view
(1; 2; 4; 8; 9; 14; 16; 18), but, until a short time ago, for a real argument only.
The first two investigations (10; 11) of these functions for complex values 
of the argument include some initial results about the zeros and also two 
small tables of function values. 

In this paper we shall prove some lemmas which yield a much more refined 
knowledge of the two integrals under consideration. Furthermore, we shall 
indicate relations to other known functions and develop new methods for 
investigating and computing the zeros of these integrals. We shall find large 
domains of the complex plane which cannot contain a zero of the Fresnel 
integrals. When determining the position of the zeros of a function it is always 
important to find (more or less accurate) approximate values for those zeros; 
then the computation up to the desired degree of accuracy can be done
schematically by means of the usual iterative methods. We shall see that in the
case of the Fresnel integrals such approximate values (which are even of
great accuracy) can be obtained in a simple manner. Also the more exact
determination of the zeros will turn out to be relatively easy if appropriate
representations of these functions are used. A table of the values of some
zeros of the Fresnel integrals can be found at the end of the paper.

Received May 12, 1956.

This work was supported by the National Research Council of Canada.


2. Fundamental relations, asymptotic behaviour. It is advantageous
to transform the integrals (1.1) by means of the substitution $p^2 = t$. In this
manner we obtain

$$C(u) = \tfrac12\int_0^{u^2} t^{-1/2}\cos t\,dt, \qquad S(u) = \tfrac12\int_0^{u^2} t^{-1/2}\sin t\,dt.$$

Since we will primarily investigate the zeros of those functions the factor $\tfrac12$
becomes unessential, and we will therefore omit it. We write

$$C(z) = \int_0^z t^{-1/2}\cos t\,dt, \qquad S(z) = \int_0^z t^{-1/2}\sin t\,dt, \tag{2.1}$$

where z = x + iy denotes a complex variable. The representations (2.1) will
be used in what follows.

For finite values of |z| the Taylor series development of the integrands
of (2.1) at z = 0 may be integrated term by term. We find

$$C(z) = z^{1/2}\sum_{m=0}^{\infty} \frac{(-1)^m z^{2m}}{(2m)!\,(2m+\tfrac12)}, \qquad S(z) = z^{1/2}\sum_{m=0}^{\infty} \frac{(-1)^m z^{2m+1}}{(2m+1)!\,(2m+\tfrac32)}. \tag{2.2}$$


The functions $z^{-1/2}C(z)$ and $z^{-3/2}S(z)$ are entire transcendental functions. C(z)
and S(z) have a branch point at z = 0; from (2.2) we obtain the relations

$$C(z e^{\pi i}) = e^{\pi i/2}\,C(z), \qquad S(z e^{\pi i}) = e^{3\pi i/2}\,S(z). \tag{2.3}$$

Furthermore,

$$C(\bar z) = \overline{C(z)}, \qquad S(\bar z) = \overline{S(z)}. \tag{2.4}$$

Hence we may limit our considerations to the first quadrant (x > 0, y > 0)
of the z-plane.
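The series (2.2) translate directly into a short program. The following Python sketch is an illustration only (the function names `fresnel_series` and `fresnel_quadrature` are hypothetical); it sums the series with the principal branch of $z^{1/2}$ and, for real positive argument, offers a crude quadrature of (2.1) as a cross-check.

```python
import cmath
import math


def fresnel_series(z, terms=80):
    """Return (C(z), S(z)) from the power series (2.2), principal branch of sqrt(z)."""
    z = complex(z)
    c_sum = 0j
    s_sum = 0j
    even = 1 + 0j  # holds (-1)^m z^(2m) / (2m)!
    for m in range(terms):
        c_sum += even / (2 * m + 0.5)
        odd = even * z / (2 * m + 1)  # (-1)^m z^(2m+1) / (2m+1)!
        s_sum += odd / (2 * m + 1.5)
        even = -odd * z / (2 * m + 2)  # advance to index m + 1
    root = cmath.sqrt(z)
    return root * c_sum, root * s_sum


def fresnel_quadrature(x, steps=20000):
    """Midpoint rule for real x > 0 after the substitution t = p^2 in (2.1)."""
    h = math.sqrt(x) / steps
    c = 0.0
    s = 0.0
    for k in range(steps):
        p = (k + 0.5) * h
        c += math.cos(p * p)
        s += math.sin(p * p)
    return 2 * h * c, 2 * h * s


if __name__ == "__main__":
    print(fresnel_series(2.0))
    print(fresnel_quadrature(2.0))
```

For large |z| the terms of the series grow to about $e^{|z|}$ before they decay, so in ordinary double precision some digits are lost to cancellation; the sketch is meant for moderate arguments.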

We now consider the asymptotic behaviour of the Fresnel integrals. In
order that the limits

$$C = \lim_{z\to\infty}\int_0^z t^{-1/2}\cos t\,dt, \qquad S = \lim_{z\to\infty}\int_0^z t^{-1/2}\sin t\,dt \tag{2.5}$$

exist, we must choose a path of integration which goes asymptotically parallel
to the real axis (y = 0) to infinity. Then C and S have a uniquely determined
finite value; transforming Euler's integral representation of the Gamma
function in a suitable manner we find

$$C = S = 2^{-1/2}\,\Gamma(\tfrac12) = \sqrt{\pi/2} = 1.2533141\ldots \tag{2.6}$$













Using such a path of integration, we have

$$C(z) + c(z) = C, \qquad S(z) + s(z) = S, \tag{2.7}$$

where

$$c(z) = \int_z^{\infty} t^{-1/2}\cos t\,dt, \qquad s(z) = \int_z^{\infty} t^{-1/2}\sin t\,dt. \tag{2.8}$$


Integrating (2.8) by parts and using (2.7), we obtain the following series
representation of the Fresnel integrals:

$$\text{(a)}\quad C(z) \sim C + z^{-1/2}\,(a(z)\cos z + b(z)\sin z), \qquad \text{(b)}\quad S(z) \sim S + z^{-1/2}\,(-b(z)\cos z + a(z)\sin z), \tag{2.9}$$

where

$$a(z) = \sum_{m=1}^{\infty} (-1)^m\,\frac{1\cdot 3\cdots(4m-3)}{(2z)^{2m-1}}, \qquad b(z) = 1 + \sum_{m=1}^{\infty} (-1)^m\,\frac{1\cdot 3\cdots(4m-1)}{(2z)^{2m}}.$$

LEMMA 1. The series (2.9) are asymptotic expansions of C(z) and S(z),
respectively, for all complex values of z with the exception of pure imaginary
ones.

Proof. Let us assume that we have integrated (2.7) by parts a number of
times so that the integrated term finally obtained involves the power $z^{-m+1/2}$.
Then, when neglecting some numerical factor which does not interest us, the
remaining integral is of the form

$$\int_z^{\infty} t^{-(m+1/2)}\cos t\,dt \qquad\text{or}\qquad \int_z^{\infty} t^{-(m+1/2)}\sin t\,dt.$$

We thus have to estimate the integrals

$$K_1 = \int_z^{\infty} t^{-(m+1/2)}e^{it}\,dt, \qquad K_2 = \int_z^{\infty} t^{-(m+1/2)}e^{-it}\,dt.$$

Setting t = z + iw and t = z − iw, respectively, we obtain

$$K_1 = i\,e^{iz}z^{-(m+1/2)}\int_0^{\infty} (1 + iw/z)^{-(m+1/2)}e^{-w}\,dw, \qquad K_2 = -i\,e^{-iz}z^{-(m+1/2)}\int_0^{\infty} (1 - iw/z)^{-(m+1/2)}e^{-w}\,dw.$$

If z is not a pure imaginary number, i.e. $|\arg z| \le \tfrac12\pi - \eta$ or $|\arg z| \ge \tfrac12\pi + \eta$,
$\eta > 0$, the inequality

$$|1 \pm iw/z| \ge \sin\eta$$

holds, and therefore

$$\left|\int_0^{\infty} (1 \pm iw/z)^{-(m+1/2)}e^{-w}\,dw\right| \le (\operatorname{cosec}\eta)^{m+1/2}.$$

Hence

$$K_1 = O(e^{iz}z^{-(m+1/2)}), \qquad K_2 = O(e^{-iz}z^{-(m+1/2)}).$$

The constant involved in the Landau symbol does not depend on arg z but
depends on $\eta$ and tends to infinity if $\eta$ tends to zero.

A value of z being given, the greatest possible accuracy is obtained if the
number of terms of (2.9) is chosen so that the last term corresponds to the
highest value of m for which

$$m \le \tfrac12\left[(|z|^2 + \tfrac14)^{1/2} + 1\right] \tag{2.10}$$

still holds, as can easily be seen.
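The truncation rule can be tried out numerically. The Python sketch below is an illustration under stated assumptions: it reads the rule as $m \le \tfrac12[(|z|^2 + \tfrac14)^{1/2} + 1]$, takes the coefficients of a(z) and b(z) to be $(-1)^m\,1\cdot3\cdots(4m-3)/(2z)^{2m-1}$ and $(-1)^m\,1\cdot3\cdots(4m-1)/(2z)^{2m}$, and compares the truncated expansion of C(z) with the power series (2.2). All function names are hypothetical.

```python
import cmath
import math

C_INF = math.sqrt(math.pi / 2)  # the constant C of (2.6)


def optimal_m(z):
    """Largest m admitted by the truncation rule (2.10), as read above."""
    return int(0.5 * (math.sqrt(abs(z) ** 2 + 0.25) + 1))


def fresnel_c_asymptotic(z, m_max=None):
    """Truncated expansion (2.9a); intended for large |z| with Re z > 0."""
    z = complex(z)
    if m_max is None:
        m_max = optimal_m(z)
    w = 1 / (2 * z)
    a = 0j
    b = 1 + 0j
    odd_product = 1.0  # equals 1*3*...*(4m-5) on entry to step m
    for m in range(1, m_max + 1):
        prod_a = odd_product * (4 * m - 3)  # 1*3*...*(4m-3)
        prod_b = prod_a * (4 * m - 1)       # 1*3*...*(4m-1)
        sign = -1 if m % 2 else 1
        a += sign * prod_a * w ** (2 * m - 1)
        b += sign * prod_b * w ** (2 * m)
        odd_product = prod_b
    return C_INF + (a * cmath.cos(z) + b * cmath.sin(z)) / cmath.sqrt(z)


def fresnel_c_series(z, terms=120):
    """Reference value of C(z) from the power series (2.2)."""
    z = complex(z)
    total = 0j
    even = 1 + 0j
    for m in range(terms):
        total += even / (2 * m + 0.5)
        even = -even * z * z / ((2 * m + 1) * (2 * m + 2))
    return cmath.sqrt(z) * total


if __name__ == "__main__":
    for z in (12.0, 10 + 2j):
        print(z, optimal_m(z), abs(fresnel_c_asymptotic(z) - fresnel_c_series(z)))
```

With this reading the rule gives m = 2 for |z| near 4.8 and m = 6 for |z| near 11.1.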


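3. Relations to other known functions. The relations of the Fresnel
integrals to the incomplete gamma functions

$$\text{(a)}\quad P(\alpha, z) = \int_0^z e^{-t}t^{\alpha-1}\,dt, \qquad \text{(b)}\quad Q(\alpha, z) = \int_z^{\infty} e^{-t}t^{\alpha-1}\,dt = \Gamma(\alpha) - P(\alpha, z) \tag{3.1}$$

are of basic importance. Setting $\alpha = \tfrac12$ and substituting t = iw and t = −iw,
respectively, we obtain from (3.1a)

$$C(z) = \tfrac12\,[i^{-1/2}P(\tfrac12, iz) + i^{1/2}P(\tfrac12, -iz)], \qquad S(z) = \tfrac12\,[i^{1/2}P(\tfrac12, iz) + i^{-1/2}P(\tfrac12, -iz)]. \tag{3.2}$$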

When computing a table of a special function the situation is very often as
follows: For small arguments the Taylor series development at z = 0 can be
used and for large values of |z| the asymptotic expansion permits a simple
calculation. The remaining difficulty consists in determining function values
for arguments which are not very close to z = 0 but are too small to be
calculated exactly enough by means of the asymptotic expansion. With respect
to the Fresnel integrals we are just in such a situation, but we can overcome
the difficulty by using the Nielsen representation (16, p. 84)

$$Q(\alpha, z+h) = Q(\alpha, z) - e^{-z}z^{\alpha-1}\sum_{m=0}^{\infty} \binom{\alpha-1}{m}\frac{P(m+1, h)}{z^m}. \tag{3.3}$$

Setting $\alpha = \tfrac12$, t = iw, and t = −iw, respectively, we obtain from (3.1b),

$$c(z) = \tfrac12\,[i^{-1/2}Q(\tfrac12, iz) + i^{1/2}Q(\tfrac12, -iz)], \qquad s(z) = \tfrac12\,[i^{1/2}Q(\tfrac12, iz) + i^{-1/2}Q(\tfrac12, -iz)], \tag{3.4}$$

and from this and (3.3),

$$c(z+h) = c(z) + i\,2^{-1}z^{-1/2}\,(P_1(z) - P_2(z)), \qquad s(z+h) = s(z) - 2^{-1}z^{-1/2}\,(P_1(z) + P_2(z)), \tag{3.5}$$

where

$$P_1(z) = e^{-iz}\sum_{m=0}^{\infty} \binom{-\tfrac12}{m}\frac{P(m+1, ih)}{(iz)^m}, \qquad P_2(z) = e^{iz}\sum_{m=0}^{\infty} \binom{-\tfrac12}{m}\frac{P(m+1, -ih)}{(-iz)^m}. \tag{3.6}$$


By means of (2.8) and (3.5) we are able to calculate the first differences of the
function values of C(z) and S(z). Starting then from function values which
can be simply obtained by using (2.2) or (2.9) we can immediately compute
the desired function values of the Fresnel integrals. For the above-mentioned
"medium" values of the argument this procedure is much better than a direct
calculation by means of (2.2). From (3.1) we have

$$\text{(a)}\quad P(1, h) = 1 - e^{-h}, \qquad \text{(b)}\quad P(m+1, h) = \int_0^h e^{-t}t^m\,dt. \tag{3.7}$$

Starting from (3.7a) and using the recurrence relation

$$P(m+1, h) = m\,P(m, h) - e^{-h}h^m,$$

the functions P(m + 1, ih) and P(m + 1, −ih) occurring in (3.6) can be
easily calculated. It is advantageous, of course, to choose a fixed value of h
for a certain computation.
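The recurrence just described is easy to program. The following Python sketch is an illustration only; it assumes the recurrence reads $P(m+1, h) = m\,P(m, h) - e^{-h}h^m$ with starting value $P(1, h) = 1 - e^{-h}$, and the function name `lower_gamma_table` is hypothetical.

```python
import cmath


def lower_gamma_table(h, m_max):
    """Return [P(1, h), P(2, h), ..., P(m_max + 1, h)] for complex h.

    Uses P(1, h) = 1 - exp(-h) and the upward recurrence
    P(m + 1, h) = m * P(m, h) - exp(-h) * h**m.
    """
    h = complex(h)
    decay = cmath.exp(-h)
    values = [1 - decay]
    power = 1 + 0j  # holds h**m
    for m in range(1, m_max + 1):
        power *= h
        values.append(m * values[-1] - decay * power)
    return values


if __name__ == "__main__":
    for m, value in enumerate(lower_gamma_table(0.5j, 4), start=1):
        print(m, value)
```

A caution that goes beyond the text: when m is much larger than |h| the two terms of the recurrence nearly cancel, so the upward direction gradually loses accuracy and is best kept to moderate m.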
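For the sake of completeness we mention also the following relations: using
the integral representation (15, p. 87)

$${}_1F_1(a, c; z) = \frac{\Gamma(c)}{\Gamma(a)\,\Gamma(c-a)}\int_0^1 e^{zt}t^{a-1}(1-t)^{c-a-1}\,dt$$

of Kummer's function ${}_1F_1(a, c; z)$, setting $a = \tfrac12$, $c = \tfrac32$ and $z = \pm iw$,
respectively, we obtain

$$C(z) = \sqrt{z}\,[{}_1F_1(\tfrac12, \tfrac32; -iz) + {}_1F_1(\tfrac12, \tfrac32; iz)], \qquad S(z) = i\sqrt{z}\,[{}_1F_1(\tfrac12, \tfrac32; -iz) - {}_1F_1(\tfrac12, \tfrac32; iz)]. \tag{3.8}$$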


It is known that the Fresnel integrals are also related to the error function

$$\Phi(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt;$$

substituting $t = \sqrt{\pm iw}$ we have

$$C(z) = \tfrac12\sqrt{\pi}\,[i^{1/2}\,\Phi(\sqrt{-iz}) + i^{-1/2}\,\Phi(\sqrt{iz})], \qquad S(z) = \tfrac12\sqrt{\pi}\,[i^{-1/2}\,\Phi(\sqrt{-iz}) + i^{1/2}\,\Phi(\sqrt{iz})]. \tag{3.9}$$

4. Domains which cannot contain zeros. Let us first give a simple proof
of the fact that all zeros z ($\ne 0$) of the Fresnel integrals must be complex.


LEMMA 2. The Fresnel integrals do not vanish for any real or purely imaginary 
argument different from zero. All zeros of these functions are simple and conjugate 
complex to each other in pairs. 


Proof. In consequence of (2.3) we may consider positive values of x only.
From the form of the integrand of (2.1) it follows that

$$\begin{aligned}
\text{(a)}\quad & C((4n+1)\pi/2) - C((4n-1)\pi/2) > 0, && n = 1, 2, \ldots,\\
& C((4n+3)\pi/2) - C((4n+1)\pi/2) < 0, && n = 0, 1, \ldots,\\
\text{(b)}\quad & S((2n+1)\pi) - S(2n\pi) > 0, && n = 0, 1, \ldots,\\
& S(2n\pi) - S((2n-1)\pi) < 0, && n = 1, 2, \ldots.
\end{aligned} \tag{4.1}$$

Since $x^{-1/2}$ is monotone,

$$\begin{aligned}
\text{(a)}\quad & |C((2n+3)\pi/2) - C((2n+1)\pi/2)| < |C((2n+1)\pi/2) - C((2n-1)\pi/2)|,\\
\text{(b)}\quad & |S((n+1)\pi) - S(n\pi)| < |S(n\pi) - S((n-1)\pi)|, \qquad n = 1, 2, \ldots.
\end{aligned} \tag{4.2}$$

From (4.1b), (4.2b), and S(0) = 0 it follows that $S(x) \ne 0$ for any real
value of $x \ne 0$. In order to draw the same conclusion with respect to C(x)
from (4.1a) and (4.2a) we have to prove that $C(3\pi/2) > 0$. Using (2.6),
(2.8) and integrating by parts, we find

$$C(3\pi/2) = \sqrt{\pi/2} - \sqrt{2/(3\pi)} + \tfrac34\int_{3\pi/2}^{\infty} t^{-5/2}\cos t\,dt.$$

Since $t^{-5/2}$ is monotone,

$$\left|\int_{(2n-1)\pi/2}^{(2n+1)\pi/2} t^{-5/2}\cos t\,dt\right| > \left|\int_{(2n+1)\pi/2}^{(2n+3)\pi/2} t^{-5/2}\cos t\,dt\right|, \qquad n = 1, 2, \ldots.$$

Consequently

$$\int_{3\pi/2}^{\infty} t^{-5/2}\cos t\,dt > 0.$$

Hence $C(3\pi/2) > 0$. This completes the proof that C(z) cannot vanish for
real values of x ($\ne 0$). If z is a pure imaginary number all terms in (2.2) have
the same sign; hence there cannot exist a pure imaginary zero of the Fresnel
integrals. The existence of zeros follows from the fact that $z^{-1/2}C(z)$ and $z^{-3/2}S(z)$
are entire functions which are not of the kind of an exponential function.
Since the integrands of the Fresnel integral have real zeros only the zeros of
the integrals are simple. Since (2.2) has real coefficients the zeros of the Fresnel
integrals are conjugate complex in pairs. This completes the proof of Lemma 2.

We now consider the possibility of limiting the zeros of C(z) and S(z) to
certain domains of the complex plane.


THEOREM 3. The Fresnel integral C(z) cannot possess zeros in any of the
strips which are parallel to the y-axis and correspond to the values

$$0 < x \le \pi, \qquad (4n-1)\pi/2 \le x \le (2n+1)\pi, \qquad n = 1, 2, \ldots.$$

The same is true for S(z) with respect to the strips parallel to the y-axis and
corresponding to the values

$$0 < x \le 3\pi/2, \qquad 2n\pi \le x \le (4n+3)\pi/2, \qquad n = 1, 2, \ldots.$$

Proof. Because of (2.3) and (2.4) we may consider the first quadrant
(x > 0, y > 0) of the z-plane only. In order to prove the first of the two
statements we start from the integral

$$J = C(x + iy) - C(x) = \int_x^{x+iy} t^{-1/2}\cos t\,dt.$$

Setting t = x + iw and $(x + iw)^{-1/2} = a + ib$ we obtain

$$\Re J = \sin x\int_0^y a\,\sinh w\,dw - \cos x\int_0^y b\,\cosh w\,dw,$$

$$\Im J = \sin x\int_0^y b\,\sinh w\,dw + \cos x\int_0^y a\,\cosh w\,dw.$$

Since x > 0 and y > 0 we have a > 0 and b < 0 for all values of z under
consideration. C(x) is real and not negative, cf. Lemma 2. We thus obtain

$$\Re J > 0, \quad \Re C(z) > 0 \qquad (2n\pi \le x \le (4n+1)\pi/2),$$

and

$$\Im J < 0 \quad ((4n+1)\pi/2 \le x \le (2n+1)\pi), \qquad \Im J > 0 \quad ((4n+3)\pi/2 \le x \le (2n+2)\pi).$$

From this the statement on C(z) follows. The second part of Theorem 3 can
be proved in a similar manner.

It should be noticed that the idea of the proof of Theorem 3 can be applied
to more general integrals of the type

$$C(z, a) = \int_0^z t^{-a}\cos t\,dt, \qquad 0 < a < 1, \tag{4.3}$$

in order to obtain the same result on the zeros. Also the integrals

$$S(z, a) = \int_0^z t^{-a}\sin t\,dt, \qquad 0 < a < 2, \tag{4.4}$$

may be considered in this manner, but the method is not applicable to integrals
(4.4) having values of a between 1 and 2 (exclusively), since in this case
$\Re\,t^{-a} > 0$ and $\Im\,t^{-a} < 0$ may not hold. Indeed, for sufficiently large values
of a (< 2), S(z, a) has zeros also outside of the strips defined by

$$(4n-1)\pi/2 < x < 2n\pi.$$


5. Formulas of approximation for the zeros. From Theorem 3 we can 
draw the important conclusion that all zeros of C(z) and S(z) are at a sufficiently 
large distance from the origin z = 0. This fact enables us to use the asymptotic 
expansion (2.9) for a more detailed investigation of those zeros. 

As was pointed out in the introduction, it is always important to have 
approximation formulas for the position of the zeros of a function, since 
approximate values can yield the starting point for applying one of the usual 
iterative methods for a more accurate determination of those zeros. We will 
now derive simple approximation formulas for the zeros of C(z) and S(z). 

In consequence of (2.9) the zeros of the equation

$$\sin z = -C\sqrt{z} \tag{5.1}$$

are first approximations of the zeros of C(z). Setting $\sqrt{z} = p + iq$ and using
(2.6), we obtain from (5.1)

$$p = -\sqrt{2/\pi}\,\sin x\cosh y, \qquad q = -\sqrt{2/\pi}\,\cos x\sinh y. \tag{5.2}$$


We consider the strip $S_n$: $(2n-2)\pi < x < 2n\pi$, y > 0, which is parallel to
the y-axis. In consequence of Theorem 3, only the part $S_n' \subset S_n$, defined by
$(2n-1)\pi < x < (4n-1)\pi/2$, y > 0 can contain a complex zero of C(z).
Setting

$$H(z, a) = \int_0^z t^{-a}\cos t\,dt, \qquad \tfrac12 \ge a \ge 0,$$

we have H(z, 0) = sin z and $H(z, \tfrac12) = C(z)$. H(z, 0) has real zeros and
$H(z, \tfrac12)$ has complex zeros only. Hence, if a decreases monotonely from $\tfrac12$ to 0
then, for a certain value $a_0 = a_0(n)$, $\tfrac12 > a_0 > 0$, we must have a real zero
$z_{0n}$ of H(z, a) in $S_n$ a first time. Since, for all values of a, H(z, a) has a minimum
at $x = (4n-1)\pi/2$ the zero $z_{0n}$ must coincide with that point. Hence, when
denoting by $x_n$ the real part of the zero of C(z) in $S_n'$ and setting

$$x_n = (4n-1)\pi/2 - \gamma_n, \tag{5.3}$$

because of continuity $\gamma_n$ must be a small (positive) quantity. We have

$$\cos x_n = -\gamma_n + O(\gamma_n^3), \qquad \sin x_n = -1 + O(\gamma_n^2),$$

where the functions indicated by the Landau symbol are small of higher order.
The absolute values of the zeros of C(z), even that of the smallest one, are
relatively large, cf. Theorem 3. Hence the same is true for the corresponding
quantities |p|. Setting

$$\cosh y = \tfrac12 e^{y} + O(e^{-y}), \qquad \sinh y = \tfrac12 e^{y} + O(e^{-y}),$$

the second term is thus small in comparison with the first one. Omitting the
functions indicated by the Landau symbols and using


$$x = p^2 - q^2 = \frac{2}{\pi}\,(\sin^2 x\cosh^2 y - \cos^2 x\sinh^2 y), \tag{5.4}$$

cf. (5.2), we obtain the following approximate expression $y_{0n}$ for the imaginary
part $y_n$ of the zero of C(z) which is located in $S_n$:

$$y_{0n} = \log(\pi\sqrt{4n-1}), \qquad n = 1, 2, \ldots. \tag{5.5}$$

From (5.3)-(5.5) and

$$y = 2pq = \frac{1}{\pi}\,\sin 2x\,\sinh 2y,$$

cf. (5.2), we obtain a similar approximation $x_{0n}$ for the real part $x_n$ of the zero
of C(z) which is located in $S_n$:

$$x_{0n} = \frac{(4n-1)\pi}{2} - \frac{\log(\pi\sqrt{4n-1})}{(4n-1)\pi}, \qquad n = 1, 2, \ldots. \tag{5.6}$$
The degree of accuracy of these simple formulas (5.5) and (5.6) is relatively 
high. For the real and imaginary part of the smallest zero of C(z) the error 
amounts to 2 per cent and 1 per cent, respectively. The accuracy increases 
rapidly with increasing values of |z|, as can easily be proved. 
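The approximations can be tried as starting values for an iterative refinement. The Python sketch below is an illustration under stated assumptions: it reads the approximate zero of C(z) in the nth strip as $x_{0n} = (4n-1)\pi/2 - \log(\pi\sqrt{4n-1})/((4n-1)\pi)$, $y_{0n} = \log(\pi\sqrt{4n-1})$, evaluates C(z) by the power series (2.2), and applies Newton's method with $C'(z) = z^{-1/2}\cos z$, which follows from (2.1). The function names are hypothetical, and the Newton iteration on the series is a substitute for, not a reproduction of, the procedure developed in the paper.

```python
import cmath
import math


def fresnel_c(z, terms=120):
    """C(z) from the power series (2.2), principal branch of sqrt(z)."""
    z = complex(z)
    total = 0j
    even = 1 + 0j  # holds (-1)^m z^(2m) / (2m)!
    for m in range(terms):
        total += even / (2 * m + 0.5)
        even = -even * z * z / ((2 * m + 1) * (2 * m + 2))
    return cmath.sqrt(z) * total


def approx_zero_c(n):
    """First approximation of the nth zero of C(z) in the upper half plane."""
    y0 = math.log(math.pi * math.sqrt(4 * n - 1))
    x0 = (4 * n - 1) * math.pi / 2 - y0 / ((4 * n - 1) * math.pi)
    return complex(x0, y0)


def refine_zero_c(z, steps=30):
    """Newton's method for C(z) = 0, using C'(z) = cos(z) / sqrt(z)."""
    for _ in range(steps):
        z = z - fresnel_c(z) * cmath.sqrt(z) / cmath.cos(z)
    return z


if __name__ == "__main__":
    for n in (1, 2):
        z0 = approx_zero_c(n)
        z = refine_zero_c(z0)
        print(n, z0, z, abs(fresnel_c(z)))
```

Comparing the printed starting values with the refined ones gives a direct impression of the accuracy of the approximation formulas for the first two zeros.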













Comparing (5.5) and (5.6) we find 


LEMMA 4. The imaginary part $y_n$ of the nth zero of C(z) increases monotonely
with n. We have

$$\lim_{n\to\infty} y_n/x_n = 0. \tag{5.7}$$

The difference $\gamma_n$ between $x_n$ and $(4n-1)\pi/2$ decreases monotonely with
increasing n and tends to zero if n tends to infinity.


Using (5.1), the third term of (2.9) takes the form

$$-\tfrac12\,z^{-3/2}\cos z = i\,(2z)^{-1}\sqrt{\tfrac12\pi - z^{-1}},$$

which becomes arbitrarily small if |z| increases arbitrarily. That is, if the
desired accuracy is not too great one may restrict oneself to the first approximation
of the zeros.


In consequence of (2.9) the zeros of the equation

$$\cos z = S\sqrt{z} \tag{5.8}$$

are the first approximation of the zeros $z_n^* = x_n^* + iy_n^*$ of S(z). In a manner
similar to the preceding one we obtain from (5.8)

$$y_{0n}^* = \log(2\pi\sqrt{n}), \qquad n = 1, 2, \ldots, \tag{5.9}$$

and

$$x_{0n}^* = 2n\pi - \frac{\log(2\pi\sqrt{n})}{4n\pi}, \qquad n = 1, 2, \ldots. \tag{5.10}$$

These formulas are also of a relatively high degree of accuracy. For the smallest
zero of S(z) the error is about 5 per cent for the imaginary part and 1 per cent
for the real part. The error is much smaller for the larger zeros, e.g. about
1 per cent for the imaginary part and 0.3 per cent for the real part of the
second zero, etc.
From (5.9) and (5.10) we find

$$\lim_{n\to\infty} y_n^*/x_n^* = 0. \tag{5.11}$$


6. More exact determination of the zeros. In the preceding section
first approximations of the zeros of the Fresnel integrals C(z) and S(z) were
obtained from (2.9). As follows from Theorem 3 and (2.10) the expansion
(2.9) permits also a more exact determination of those zeros. In the case of
the smallest zero (2.10) yields m = 2, i.e. the greatest possible accuracy is
obtained if we take the constant term and the next 4 terms of (2.9) and determine
the smallest zero of the function thus obtained. In the case of the second
zero we have from (2.10) m = 6, i.e. we have to take the constant term and
the next 12 terms of (2.9), etc. However, even the simplest of the equations
which we obtain in this manner is too complicated and cannot be solved
immediately. But there is another way which will turn out to be very simple.

Let us first consider the Fresnel integral C(z). We start from the values
obtained from (5.5) and (5.6) and improve those values by applying the
Newton method. The values $z_{1n}$ thus obtained from $z_{0n}$ are more accurate
approximations of the zeros of (5.1). We now apply the Newton method
several times and, from step to step, we always take into account one more
term of (2.9a). Let us denote by $f_p(z)$ the function which is obtained by taking
the constant term and the next p terms of (2.9a). The zero of the equation
$f_p(z) = 0$ which is contained in the strip $S_n'$ (cf. §5) will be denoted by $z_{pn}$.

The derivative of $f_p(z)$ is given by the simple expression

$$f_p'(z) = z^{-1/2}\,(\cos z + h_p(z)), \qquad p = 1, 2, \ldots, \tag{6.1}$$

where $h_p(z)$ is of the form $k_1 z^{-p}\sin z$ or $k_2 z^{-p}\cos z$, $k_1$ and $k_2$ denoting certain
constants; all other terms drop out in pairs; $h_p(z)$ is small in comparison with
cos z. If $z_{p-1,n}$ denotes the zero of the equation $f_{p-1}(z) = 0$ in $S_n'$ then $f_p(z_{p-1,n})$
consists of one term only, namely, of the last term of (2.9a) under consideration.
The function $\tan z_{p,n}$ occurring as a factor in some of the Newton quotients
$f/f'$ may be replaced by i; this simplification is the same as that in the
preceding section where we omitted the functions indicated by Landau
symbols.

In the case of the Fresnel integral S(z) the reasoning is exactly the same as 
in the case of C(z). 

The procedure yields a finite sequence $z_{n,1}, z_{n,2}, \ldots$ of approximative values
of the zero $z_n$ of $C(z)$. The terms of this sequence are recursively determined
by the following simple relation:

(6.2) $z_{n,p+1} = z_{n,p} + c_p (2z_{n,p})^{-p}, \qquad p = 1, 2, \ldots,$

where

$c_{2q+1} = (-1)^{q}\, 1 \cdot 3 \cdots (4q + 1), \qquad q = 0, 1, \ldots,$
$c_{2q} = (-1)^{q+1}\, i\, 1 \cdot 3 \cdots (4q - 1), \qquad q = 1, 2, \ldots.$

For $S(z)$ we similarly find

(6.3) $z_{n,p+1}^* = z_{n,p}^* + c_p (2z_{n,p}^*)^{-p}, \qquad p = 1, 2, \ldots,$

where the constants $c_p$ are the same as in the preceding formula. Of course
the numerical values of the different correction terms are entirely different
in both cases, since $z_{n,p}$ differs from the corresponding approximate value
$z_{n,p}^*$. For every fixed value of $p$ the corresponding correction term decreases
monotonely with increasing $n$. Since, for fixed $n$ and $p$, $|z_{n,p}^*| > |z_{n,p}|$, the
absolute value of the correction term $c_p(2z_{n,p}^*)^{-p}$ is smaller than that of
$c_p(2z_{n,p})^{-p}$, but greater than that of $c_p(2z_{n+1,p})^{-p}$.
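The recursion (6.2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `correction_constant` and `refine` are hypothetical, the factor $i$ in the even constants is inferred from the remark that $\tan z_{n,p}$ may be replaced by $i$, and the starting value $z_{n,1}$ must be supplied by the reader (for instance from (5.5) and (5.6)).

```python
def correction_constant(p):
    """Constant c_p of (6.2); the factor i for even p is an assumption."""
    if p % 2 == 1:                       # p = 2q + 1, q = 0, 1, ...
        q = (p - 1) // 2
        sign, top, unit = (-1) ** q, 4 * q + 1, 1
    else:                                # p = 2q, q = 1, 2, ...
        q = p // 2
        sign, top, unit = (-1) ** (q + 1), 4 * q - 1, 1j
    product = 1
    for odd in range(1, top + 1, 2):     # 1 * 3 * ... * top
        product *= odd
    return sign * unit * product


def refine(z1, steps):
    """Apply z_{n,p+1} = z_{n,p} + c_p * (2 z_{n,p})**(-p) for p = 1..steps."""
    z = complex(z1)
    for p in range(1, steps + 1):
        z = z + correction_constant(p) * (2 * z) ** (-p)
    return z
```

A call such as `refine(z1, 3)` applies the first three correction terms to a starting value `z1`. Since the underlying expansion is asymptotic, only a small number of steps is meaningful for a given zero.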


7. Further properties of the zeros. From the preceding results we can 
draw some conclusions which might be of interest. Let us compare the zeros of 
C(z) with those of S(z). From (5.5), (5.6), (5.9), and (5.10) we find that not 
* 


only the sequences (Yon), (¥on™), (Yn), (¥n*), where y,* = (4nx)—'log (244/n 













are monotone, but also the sequences $y_{0,1}, y_{0,1}^*, y_{0,2}, y_{0,2}^*, \ldots$ and $y_1, y_1^*,
y_2, y_2^*, \ldots$. We thus obtain


THEOREM 5. The zeros of C(z) and S(z) are (asymptotically) located on one 
and the same logarithmic curve, in alternating order; this curve can be represented 
in the form 
(7.1) $y = \pm \tfrac{1}{2} \log 2\pi x$.
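As a rough numerical illustration of Theorem 5, the following Python sketch compares a few tabulated zeros from Section 9 with the upper branch of the curve $y = \tfrac{1}{2}\log 2\pi x$. The helper names are hypothetical; the sample values are the entries for $n = 10, 20, \ldots$ of the two tables, read with the decimal point restored.

```python
import math


def curve_height(x):
    """Upper branch y = (1/2) * log(2 * pi * x) of the logarithmic curve (7.1)."""
    return 0.5 * math.log(2.0 * math.pi * x)


# (x_n, y_n) for n = 10, 20, 30, 40, 50, taken from the table of zeros of C(z).
zeros_of_C = [(61.25, 2.98), (124.08, 3.33), (186.92, 3.53),
              (249.75, 3.68), (312.58, 3.79)]
# (x_n*, y_n*) for n = 10, 20, 30, 40, taken from the table of zeros of S(z).
zeros_of_S = [(62.81, 2.99), (125.65, 3.34), (188.49, 3.54), (251.32, 3.68)]


def max_deviation(zeros):
    """Largest distance in y between the tabulated zeros and the curve."""
    return max(abs(y - curve_height(x)) for x, y in zeros)
```

For these samples the deviation stays within the rounding of the two-decimal tables; for the first few zeros the agreement is poorer, as is to be expected of an asymptotic statement.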


In consequence of (5.3), (5.7) and (5.11), the series

(7.2) $\sum_{n=1}^{\infty} |z_n|^{-1}, \qquad \sum_{n=1}^{\infty} |z_n^*|^{-1}$

are minorants of the harmonic series. Hence the series

(7.3) $\sum_{n=1}^{\infty} |z_n|^{-1-\epsilon}, \qquad \sum_{n=1}^{\infty} |z_n^*|^{-1-\epsilon}, \qquad \epsilon > 0,$

converge, but the series (7.2) diverge. The functions $z^{-1/2}C(z)$ and $z^{-3/2}S(z)$ thus
are entire functions of first order of divergence class.
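According to the order of $z^{-1/2}C(z)$ and $z^{-3/2}S(z)$ the exponent of the exponential
function contained in the Weierstrass product of these functions can at most
be a linear function of $z$. It can be proved that, in our case, this function is
actually a constant. Since

$\lim_{z \to 0} z^{-1/2} C(z) = 2, \qquad \lim_{z \to 0} z^{-3/2} S(z) = \tfrac{2}{3},$

the Weierstrass products of the Fresnel integrals have the form

(7.4) $C(z) = 2z^{1/2} \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{z_n^2}\right)\left(1 - \frac{z^2}{\bar{z}_n^{\,2}}\right), \qquad S(z) = \tfrac{2}{3} z^{3/2} \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{z_n^{*2}}\right)\left(1 - \frac{z^2}{\bar{z}_n^{*2}}\right).$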


8. Distribution of the function values of the Fresnel integrals. Tables 
of the function values of the Fresnel integrals for complex values of the argu- 
ment have been communicated in (10; 11). Both functions have a similar 
behaviour which can be most simply described by characterising the geometric 
form of the surfaces F(C) and F(S) of the absolute value of C(z) and S(z).

In consequence of the maximum principle, the real extreme values of C(z)
and S(z) correspond to saddle points of F(C) and F(S). Because of (2.4),
both surfaces are symmetric with respect to the x-axis. Any zero is a singular 
point of the surfaces which, in a small neighbourhood of such a point, behave 
like a circular cone Z; the angle between the generators of Z and the xy-plane 
is about 32, as can be seen from the Taylor series development of C(z) and 
S(z) at a zero under consideration. Any other point which corresponds to a 
finite value of z is a regular point of the surfaces. It can be seen from (2.9) 
that the surfaces ascend rapidly for large $y$. The tangent of the angle $\alpha(z)$
between the direction of maximum slope and the xy-plane is asymptotically
given by the expression

(8.1) $\tan \alpha(z) = \tfrac{1}{2} |z|^{-1/2} e^{y}$.

Along the lines $y$ = const., the maximal slope decreases with increasing $x$.
There exist real points of inflection of the curves $C(x)$ and $S(x)$ at $x = n\pi - \delta(n)$
and $x = (2n + 1)\pi/2 - \delta^*(n)$, respectively, where $\delta(n)$ and $\delta^*(n)$ are positive


and monotonely decreasing functions of $n$. All these points are isolated
parabolic points of the surfaces F(C) and F(S). The surfaces consist of domains
of positive and negative Gaussian curvature which are bounded and separated
from each other by curves whose points are parabolic ("parabolic curves").
These curves can be obtained from

(8.2) $\operatorname{Re}\left(f'^2/(f''f)\right) - 1 = 0$

(cf. 21), where $f = C(z)$ or $S(z)$, respectively. Through any zero there passes
exactly one of those curves; the curves remain always in a neighbourhood
of the curves of constant phase $\pi/2$ and $3\pi/2$ with which they asymptotically
coincide.


9. Tables. The methods developed in this paper enable us to investigate 
and to calculate the zeros of the Fresnel integrals in a simple manner. 

We did not consider the functions c(z) and s(z) defined by (2.7). Although 
these functions are very simply related to the Fresnel integrals, their behaviour 
is different from that of C(z) and S(z). Since c(z) and s(z) also occur in connection
with many practical problems they should eventually be studied in more
detail; this will be done in a subsequent paper.


Zeros $z_n = x_n + iy_n$ of $C(z)$

 n     x_n   y_n |  n     x_n   y_n |  n     x_n   y_n |  n     x_n   y_n |  n     x_n   y_n
 1    4.62  1.68 | 11   67.53  3.03 | 21  130.36  3.35 | 31  193.20  3.55 | 41  256.03  3.69
 2   10.94  2.11 | 12   73.81  3.07 | 22  136.65  3.38 | 32  199.48  3.57 | 42  262.32  3.70
 3   17.24  2.34 | 13   80.10  3.11 | 23  142.93  3.40 | 33  205.77  3.58 | 43  268.60  3.72
 4   23.53  2.50 | 14   86.38  3.15 | 24  149.21  3.42 | 34  212.05  3.60 | 44  274.88  3.73
 5   29.82  2.62 | 15   92.66  3.18 | 25  155.50  3.44 | 35  218.33  3.61 | 45  281.17  3.74
 6   36.11  2.71 | 16   98.95  3.22 | 26  161.78  3.46 | 36  224.62  3.63 | 46  287.45  3.75
 7   42.40  2.79 | 17  105.23  3.25 | 27  168.06  3.48 | 37  230.90  3.64 | 47  293.73  3.76
 8   48.68  2.86 | 18  111.51  3.28 | 28  174.35  3.50 | 38  237.18  3.65 | 48  300.02  3.77
 9   54.96  2.92 | 19  117.80  3.30 | 29  180.63  3.52 | 39  243.47  3.67 | 49  306.30  3.78
10   61.25  2.98 | 20  124.08  3.33 | 30  186.92  3.53 | 40  249.75  3.68 | 50  312.58  3.79

Zeros $z_n^* = x_n^* + iy_n^*$ of $S(z)$

 n    x_n*  y_n* |  n    x_n*  y_n* |  n    x_n*  y_n* |  n    x_n*  y_n* |  n    x_n*  y_n*
 1    6.20  1.74 | 11   69.09  3.04 | 21  131.93  3.36 | 31  194.77  3.55 | 41  257.60  3.69
 2   12.51  2.16 | 12   75.38  3.08 | 22  138.22  3.38 | 32  201.05  3.57 | 42  263.89  3.71
 3   18.81  2.37 | 13   81.66  3.12 | 23  144.50  3.41 | 33  207.34  3.59 | 43  270.17  3.72
 4   25.10  2.52 | 14   87.95  3.16 | 24  150.79  3.43 | 34  213.62  3.60 | 44  276.45  3.73
 5   31.38  2.63 | 15   94.23  3.19 | 25  157.07  3.45 | 35  219.90  3.62 | 45  282.74  3.74
 6   37.67  2.72 | 16  100.52  3.23 | 26  163.35  3.47 | 36  226.19  3.63 | 46  289.02  3.7…
 7   43.95  2.80 | 17  106.80  3.25 | 27  169.64  3.49 | 37  232.47  3.64 | 47  295.30  3.…
 8   50.24  2.87 | 18  113.08  3.28 | 28  175.92  3.50 | 38  238.75  3.66 | 48  301.59  …
 9   56.52  2.93 | 19  119.37  3.31 | 29  182.20  3.52 | 39  245.04  3.67 | 49  307.87  …
10   62.81  2.99 | 20  125.65  3.34 | 30  188.49  3.54 | 40  251.32  3.68 | 50  314.15  …




REFERENCES

1. A. Bloch, Sur les intégrales de Fresnel, Darboux Bull. (2), 46 (1922), 34-35.
2. M. Brelot, Quelques propriétés des fonctions de Gilbert et de la spirale de Cornu, Bull. Sci. Math. (2), 61 (1937), 133-160.
3. W. Buerck und H. Lichte, Untersuchungen über die Laufzeit in Vierpolen und die Verwendbarkeit der Gleitfrequenzmethode, Elektr. Nachr. Technik, 14 (1938), 78-101.
4. M. F. Egan, Gamma functions and Fresnel's integrals, Math. Gazette, 19 (1935), 366-367.
5. F. Emde, Kurvenlineale, Instrum. Kunde, 58 (1938), 409-413.
6. B. de Haan, Nouvelles tables d'intégrales définies (Leyden, 1867).
7. A. L. Higgins, The Transition Spiral and its Introduction to Railway Curves (London, 1921).
8. P. Humbert, Sur les intégrales de Fresnel, Bull. Soc. Sc. Cluj, 7 (1934), 530-534.
9. V. Jamet, Sur les intégrales de Fresnel, Nouv. Ann. (4), 3 (1903), 357-359.
10. E. Kreyszig, Ueber den allgemeinen Integralsinus Si(z, α), Acta Math., 85 (1951), 117-181.
11. E. Kreyszig, Der allgemeine Integralkosinus Ci(z, α), Acta Math., 89 (1953), 107-131.
12. Th. Laible, Höhenkarte des Fehlerintegrals, Z. angew. Math. Physik, 2 (1951), 484-486.
13. D. N. Lehner, Cornu's spiral as a transition curve, Cal. Inst. Tech., 3 (1904), 71-82.
14. A. Lindstedt, Zur Theorie der Fresnelschen Integrale, Ann. Physik (2), 17 (1882), 720-725.
15. W. Magnus and F. Oberhettinger, Formulas and Theorems for the Special Functions of Mathematical Physics (New York, 1949).
16. N. Nielsen, Theorie des Integrallogarithmus und verwandter Transzendenten (Leipzig, 1907).
17. L. Oerley, Übergangsbogen bei Strassenkrümmungen (Berlin, 1937).
18. N. A. Oumoff, Interprétation géométrique des intégrales de Fresnel, J. de Phys. (3), 6 (1897), 281-289.
19. Th. Poeschl, Das Anlaufen eines einfachen Schwingers, Ing. Archiv, 4 (1933), 98-102.
20. H. Salinger, Zur Theorie der Frequenzanalyse mittels Suchtons, Elektr. Nachr. Technik, 6 (1929), 293-302.
21. E. Ullrich, Betragflächen mit ausgezeichnetem Krümmungsverhalten, Math. Z., 54 (1951), 297-328.
22. J. Zbornik, Asymptotische Entwicklungen für Fresnelsche Integrale und verwandte Funktionen und ihre Anwendungsmöglichkeiten bei der Berechnung spezieller Raketenbahnen, Z. angew. Math. Physik, 4 (1954), 345-351.


University of Ottawa and 
Ohio State University 











A MINIMUM-MAXIMUM PROBLEM FOR 
DIFFERENTIAL EXPRESSIONS 


D. S. CARTER 


1. Introduction. In the study of approximate methods for solving ordinary
differential equations, an interesting question arises. To state it roughly for a
single first order expression, let $y_0(t)$ be the solution of the equation

(1.1) $f(t, y, \dot{y}) = 0$

which satisfies the initial condition $y(a) = \eta_a$. Let $\eta_b$ be an approximation
to the value of $y_0$ at a later time, $t = b$. Unless this approximation is exact,
there is no continuous function which satisfies (1.1), together with the two
boundary conditions

(1.2) $y(a) = \eta_a, \qquad y(b) = \eta_b$.

The question is whether there exists a continuous function satisfying (1.2),
for which the maximum absolute value of $f(t, y, \dot{y})$ on the interval $[a, b]$ is
minimized; and if so, how to find it.

Stated for higher order and multiple-component systems, this problem 
should find application in engineering and some branches of "operations
analysis." For example, in a dynamical system being driven between two
preassigned points in phase-space, it may be of interest to minimize the peak 
value of a stress which is expressible, through accelerations or friction, as the 
absolute value of a differential form. 

A moment's reflection provides some insight into the nature of the solution.
If there is a solution $y^*(t)$, then $f(t, y^*, \dot{y}^*)$ must be constant in absolute
value, so that $|f|$ is everywhere equal to its supremum. Otherwise it would be
possible to decrease $|f|$ near its maxima at the expense of increases elsewhere.
This does not mean that $f$ itself need be constant. In fact, it turns out that the
value of $f$ at the solution will generally have a number of discontinuities of
sign, especially for higher order expressions.

For the first order system (1.1), (1.2), this situation is illustrated by the
following rough variational argument. Let $y(t)$ satisfy (1.2), and let the partial
derivative $f_{\dot{y}}(t, y, \dot{y})$ be constant in sign throughout $[a, b]$. The first variation
of $f$ corresponding to a variation $\delta y$ is

$\delta f = f_y\,\delta y + f_{\dot{y}}\,\delta\dot{y}.$


Received January 6, 1956. Work performed under the auspices of the U.S. Atomic Energy 
Commission. 

The author wishes to express his indebtedness to J. Lehner and S. Ulam of the Los Alamos 
Scientific Laboratory for helpful discussions of this problem 




Solving for $\delta y$, and using the fact that $\delta y = 0$ at $t = a, b$, we have

$\int_a^b ds\,[\delta f(s)/f_{\dot{y}}(s)] \exp\left\{\int_a^s dr\, f_y(r)/f_{\dot{y}}(r)\right\} = 0,$

which is a necessary and sufficient condition for the admissibility of $\delta f$. Since
neither $f_{\dot{y}}$ nor the exponential factor changes sign, this requires little more
than that $\delta f$ shall change sign. Unless $f$ is constant throughout $[a, b]$ it is easy
to construct an admissible $\delta f$ differing in sign everywhere from $f$, so
$\sup|f + \delta f| < \sup|f|$. Conversely, if $f$ is everywhere constant, then $\delta f$ must
agree in sign with $f$ on part of the interval; and $\sup|f + \delta f| > \sup|f|$ for every
non-vanishing $\delta y$. That is to say, $\sup|f|$ has at least a local minimum at $y = y^*$
if and only if $y^*$ satisfies (1.2) and

$f(t, y^*, \dot{y}^*) = c$

for some constant $c$. Allowing for variation of $c$, this equation has a two-fold
infinity of solutions; and in the simplest cases this is just enough to yield a
unique solution satisfying (1.2).


The assumption that $f_{\dot{y}}$ be constant in sign may be relaxed simply by taking

$f(t, y^*, \dot{y}^*) = \pm c,$

where the sign agrees with that of $f_{\dot{y}}$.

The object of this paper is to present a complete theory for linear systems
alone, of which the chief results are contained in Theorem 1, §5, and Theorem 2,
§6. Detailed consideration of non-linear cases is left for later publication.
Meanwhile, rough extensions of the present theory by variational arguments
(i.e., linearization near the solution) will undoubtedly yield correct results
for most applications.


2. Formulation of the first order problem. All functions encountered
are real, single-valued, and defined on a closed, bounded interval $I = [a, b]$.

By a "vector" we mean an ordered n-tuple of functions, $f = (f^i)$, for some
fixed $n$. Similarly, a "matrix" is an $n \times n$ square matrix of functions. A vector
or matrix is said to be continuous, or summable, etc., if and only if each of its
components has that property. For matrix operations, vectors are regarded
as rows or columns according to context.

Two spaces are fundamental to the discussion. The trial solutions $y(t)$ are
taken from the space $Y$ of absolutely continuous vectors; the values of the
differential expressions lie in the space $M$ of measurable vectors. Elements
of $M$ are regarded as equivalent, written $f \sim g$, when they are equal almost
everywhere.

For all vectors we define the "vector absolute value"

(2.1) $|f| = |(f^i)| = (|f^i|)$,

and for elements of $M$ we have the "maximum-norm"













(2.2) $\|f\| = \sup\{\operatorname{ess.\,sup} |f^i| \mid i = 1, \ldots, n\}$,

where the essential supremum of each component is taken over $I$.

Given an ordered n-tuple $J = (J^i)$ of subsets of $I$, the vector whose components
are the characteristic functions of the corresponding sets $J^i$ is called
the characteristic vector of $J$.

To define the problem we are given 

(a) a first order differential operator

(2.3) $\mathcal{L}y = A[\dot{y} + By + c]$

which serves to map $Y$ into $M$. $A$ is an almost everywhere finite and non-singular
matrix, whose inverse $A^{-1}$, together with the matrix $B$ and vector $c$,
are Lebesgue summable.

(b) an ordered n-tuple $J$ of measurable subsets of $I$, at least one of whose
components has positive measure. The components of $\mathcal{L}y$ will be free to vary
on the corresponding component sets $J^i$, but must vanish almost everywhere
on the complements $I - J^i$.

(c) a pair of initial conditions for vectors $y \in Y$, one for each endpoint of $I$:

(2.4) $y(a) = \eta_a, \qquad y(b) = \eta_b$.

Let $F \subset M$ be the space of equivalence classes of essentially bounded
vectors $f$, whose components $f^i$ vanish almost everywhere on the corresponding
sets $I - J^i$. Let $X \subset Y$ be the set of all absolutely continuous vectors $x$
which satisfy the initial conditions (2.4), and such that $\mathcal{L}x \in F$. Then a vector
$y_0$ is said to be a solution of the minimum-maximum problem if and only if
$y_0 \in X$ and

(2.5) $\|\mathcal{L}y_0\| = \inf\{\|\mathcal{L}x\| \mid x \in X\}$.


It is important to notice that a given problem may be inconsistent, in the
sense that the set $X$ is empty. (For example, $n > 1$, $\mathcal{L}y = \dot{y}$, $J^1$ is null, and
$\eta_a^1 \neq \eta_b^1$. Here $\dot{x}^1 \sim 0$ for $x \in X$, which contradicts the requirement that
$x^1(a) \neq x^1(b)$.) However, we will find that every consistent problem has a
solution.


3. The set $G = \mathcal{L}X$. Although the solutions are defined in terms of $X$,
it is more convenient to work with the image $G$ of $X$ under $\mathcal{L}$. This is possible
because of

LEMMA 1. The sets $X$ and $G$ are in 1:1 correspondence, for $\mathcal{L}$ has a unique
inverse on $G$.

The proof consists of the observation that for every $f \in F$ the differential
equation $\mathcal{L}y \sim f$ has a unique solution satisfying either of the two initial
conditions (2.4).¹


¹The properties of solutions of all the differential equations considered in this section are
essentially the same as if the coefficients were continuous. They may be derived by a slight
extension of the standard methods (1; 2, chap. IX).




In view of this correspondence, the solutions may be defined in terms of $G$:
$g_0$ is a solution if and only if $g_0 \in G$ and

$\|g_0\| = \inf\{\|g\| \mid g \in G\}$.

The remainder of this section is devoted to a derivation of the necessary
and sufficient conditions stated in Lemma 2 for an element of $F$ to belong to
$G$. First we will find expressions for the two inverses of $\mathcal{L}$ corresponding to the
two initial conditions at $a$ and $b$. The required conditions follow from the fact
that these inverses are equal on $G$.

Let $W$ be the space of solutions of the homogeneous equation

$\dot{y} + By \sim 0$,

and let $Z$ be the space of solutions of the adjoint equation

$\dot{z} - zB \sim 0$.

$W$ and $Z$ are both n-dimensional subspaces of $Y$. For every pair $w \in W$ and
$z \in Z$ the "scalar product"

$zw = \sum_i z^i(t)\,w^i(t)$

is constant, independent of $t$.
If $n$ vectors $\{\xi_i\}$ form a basis for $W$, they may be combined into a matrix
$E$ which is everywhere non-singular. The matrix

$K(t, s) = E(t)E^{-1}(s)$

is independent of the choice of basis, and plays the role of "translation
operator" in $W$ and $Z$:

(3.1) $w(t) = K(t, s)w(s), \qquad z(s) = z(t)K(t, s)$.

For every $f \in F$, the pair $y_a$, $y_b$ of solutions of

$\mathcal{L}y \sim f$

which satisfy the initial conditions (2.4) at $a$ and $b$, respectively, are given by

(3.2a) $y_\alpha(t) = w_\alpha(t) + \int_\alpha^t K(t, s)[A^{-1}(s)f(s) - c(s)]\,ds, \qquad \alpha = a, b,$

where $w_\alpha(t)$ is the element of $W$ which is equal to $\eta_\alpha$ at $t = \alpha$:

(3.2b) $w_\alpha(t) = K(t, \alpha)\eta_\alpha$.

Now $y_a = y_b$ if and only if $f \in G$, so we have, with the help of (3.1):

LEMMA 2. $g \in G$ if and only if $g \in F$ and

(3.3) $u(t) = \int K(t, s)A^{-1}(s)g(s)\,ds,$²

²Integrals written without limits mean integrals from $a$ to $b$.













where $u$ is an element of $W$ which is determined by the vector $c$ and the initial
values $\eta_a$ and $\eta_b$:

(3.4) $u(t) = w_b(t) - w_a(t) + \int K(t, s)c(s)\,ds$.

Moreover, (3.3) is equivalent to the condition

(3.5) $zu = \int \zeta(s)g(s)\,ds$ for every $z \in Z$,

where, for each $z \in Z$, the vector $\zeta$ is defined by

(3.6) $\zeta = zA^{-1}$.


4. The function $\mu(z)$. Our plan is to reduce the problem so that its
solution amounts to maximizing a function on the n-dimensional space $Z$,
rather than minimizing a function on the infinite-dimensional space $F$. This
reduction depends on the following inequality, which is a direct consequence
of (3.5) and the definition of $F$.

Let $j$ be the characteristic vector of $J$. Then for every pair $z \in Z$ and
$g \in G$,

(4.1) $|zu| \le \|g\| \int |\zeta|\,j\,ds$.

Now consider the set

(4.2) $V = \{z \mid z \in Z \text{ and } \int |\zeta|\,j\,ds = 0\}$,

which is clearly a linear subspace of $Z$. When its dimension is less than $n$,
the function

(4.3) $\mu(z) = zu \Big/ \int |\zeta|\,j\,ds$

is defined on the complement $CV$ of $V$ in $Z$, and satisfies the inequality

(4.4) $\sup\{|\mu(z)| \mid z \in CV\} \le \inf\{\|g\| \mid g \in G\}$.

We will see that equality actually holds, as long as $\mu$ is bounded.
Some useful properties of $\mu(z)$ are contained in the proof of

LEMMA 3. When $V \ne Z$, $\mu(z)$ is bounded on $CV$ if and only if

(4.5) $zu = 0$ for every $z \in V$.

And if $\mu$ is bounded, then $|\mu|$ attains its supremum on $CV$.


Proof. Choose components with respect to an arbitrary basis as coordinates
in $Z$, so that $Z$ becomes homeomorphic to Euclidean n-space. Then both
numerator and denominator in the expression (4.3) are clearly continuous on
$Z$. If $\mu(z)$ is bounded on $CV$, then since $V$ forms the boundary of $CV$, the
numerator must vanish with the denominator as $z \to V$, and (4.5) must be
true.

Conversely, let (4.5) be satisfied. Let $U$ be any linear subspace of $Z$ complementary
to $V$, so that every $z \in CV$ is expressible in the form $z = z_1 + z_2$,
where $z_1 \in U$, $z_2 \in V$, and $z_1 \ne 0$. Substituting into (4.3), we find that
$\mu(z_1 + z_2) = \mu(z_1)$. Moreover, for every real $\alpha$, $\mu(\alpha z) = \pm\mu(z)$, where the
sign agrees with that of $\alpha$. Hence $\mu$ assumes all its functional values for $z$
on the unit sphere in $U$. The proof is completed by recalling that a continuous
function on a closed and bounded set in Euclidean space is bounded, and
attains its supremum on that set.


5. Solution of the first order problem. The way is now prepared for

THEOREM 1. The problem is consistent if and only if the condition (4.5) of
Lemma 3 is satisfied. Moreover, every consistent problem has a solution as
follows:

(a) in the trivial case $u = 0$, there is a unique solution $g_0 \sim 0$.

(b) in the "degenerate case" $V = Z$, the problem is consistent only if $u = 0$.

(c) in all other cases the function $|\mu(z)|$ has a positive supremum which is
attained for some $z \in CV$. Let $z_0$ be any such point, and let

(5.1) $\mu_0 = \mu(z_0), \qquad \zeta_0 = z_0A^{-1}$.

Then every solution $g_0$ has the property

(5.2) $\|g_0\| = |\mu_0|$,

and its components are uniquely determined on the sets

(5.3) $J_0^i = \{t \mid t \in J^i,\ \zeta_0^i(t) \ne 0\}$

by the equivalences

(5.4) $g_0^i \sim \mu_0\,\zeta_0^i/|\zeta_0^i|$.

Thus each $g_0^i$ is equal in absolute value to the constant $|\mu_0|$ almost everywhere on
$J_0^i$; and the solution is unique if every $J_0^i$ has the same measure as the corresponding
$J^i$.
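For orientation, the simplest instance of case (c) can be worked out in closed form. The Python sketch below treats the scalar problem $n = 1$, $A = 1$, $J = I$, with constant coefficients $B \ne 0$ and $c$; these restrictions, and the function names, are assumptions made only for this illustration. Here $Z$ is spanned by $z(t) = e^{Bt}$, so that $\mu_0 = zu/\int |z|\,ds$ is a single number and the solution is $g_0 = \mu_0$ on all of $I$.

```python
import math


def minimax_constant(a, b, eta_a, eta_b, B, c):
    """mu_0 = z u / int |z| ds for the scalar operator y' + B y + c on [a, b], B != 0."""
    # With z(t) = exp(B t), the scalar product z u is independent of t:
    # z u = exp(B b) eta_b - exp(B a) eta_a + c * int_a^b exp(B s) ds.
    integral_abs_z = (math.exp(B * b) - math.exp(B * a)) / B
    zu = math.exp(B * b) * eta_b - math.exp(B * a) * eta_a + c * integral_abs_z
    return zu / integral_abs_z


def endpoint_value(a, b, eta_a, B, c, g):
    """y(b) for the solution of y' + B y + c = g with y(a) = eta_a (closed form)."""
    decay = math.exp(-B * (b - a))
    return eta_a * decay + (g - c) * (1.0 - decay) / B
```

Driving the equation with the constant $g_0 = \mu_0$ from $y(a) = \eta_a$ reproduces the prescribed value $y(b) = \eta_b$, which is the content of Lemma 2 in this special case. For example, `endpoint_value(0, 1, 1, 0.5, 0.2, minimax_constant(0, 1, 1, 2, 0.5, 0.2))` returns the prescribed endpoint value 2 up to rounding.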


Proof. The problem is inconsistent if (4.5) is violated, for otherwise the
inequality (4.1) provides an obvious contradiction. Now the result for case
(b) follows, since (4.5) implies $u = 0$ when $V = Z$. And in view of Lemma 2
the result for case (a) is trivial.

The rest of the proof deals with case (c), in which $u \ne 0$ and $V \ne Z$.
Assuming that (4.5) is satisfied, we will construct a $g_0 \in F$ that satisfies
(3.3), (5.2), and (5.4). Then in view of Lemma 2, $g_0 \in G$, so (4.5) implies
consistency. And by virtue of (4.4), $g_0$ is also a solution. To show that the
components of every solution satisfy (5.4), let $g_1$ be any other solution. Since

$|g_1^i| \le |g_0^i| = |\mu_0|$

almost everywhere on $J_0^i$, there exists a vector $\epsilon$ such that $g_0^i - g_1^i \sim \epsilon^i g_0^i$
on each $J_0^i$, where $0 \le \epsilon^i \le 2$. Since $g_0$ and $g_1$ both satisfy (3.5), it follows
that

$\int \zeta_0 (g_0 - g_1)\,ds = \mu_0 \sum_i \int_{J_0^i} |\zeta_0^i|\,\epsilon^i\,ds = 0$.

Hence each $\epsilon^i \sim 0$, and $g_1^i \sim g_0^i$ on $J_0^i$.













To construct $g_0$, consider the vectors $h \in F$ and $v \in W$ defined by

$h^i = \mu_0\,\zeta_0^i/|\zeta_0^i|$ on $J_0^i$, $\qquad h^i = 0$ elsewhere,

$v = \int K(t, s)A^{-1}(s)h(s)\,ds$.

If $v = u$, take $g_0 \sim h$. If $v \ne u$, define a new problem with the same spaces $W$
and $Z$ but with $u$ and $J$ replaced by

$u' = u - v, \qquad J'^i = J^i - J_0^i$.

It is shown below that the inequality

(5.5) $|zu'| \le |\mu_0| \int |\zeta|\,j'\,ds$

holds for every $z \in Z$, where $j'$ is the characteristic vector of $J'$. Then it is
easy to check that the new problem falls under case (c), and that

$\sup|\mu'(z)| \le |\mu_0|$.

Moreover, the dimension of the new space $V'$ exceeds that of $V$, for $V \subset V'$
and $z_0 \in V'$, while $z_0 \notin V$.

The procedure just outlined may be repeated to give a sequence of problems,
which terminates as soon as the condition $v = u$ is satisfied. Termination
occurs after at most $n - d$ steps, where $d$ is the dimension of the original $V$.
For if the sequence continues until $V$ is of dimension $n - 1$, then $V' = Z$,
and it follows from (5.5) that $u' = u - v = 0$.

If $h_j$ denotes the vector $h$ for the jth step, it is easy to check that

(5.6) $g_0 \sim \sum_j h_j$

has all the required properties.
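To prove (5.5), note first that

(5.7) $(z_0 + \alpha z)u/\mu_0 \le \int |\zeta_0 + \alpha\zeta|\,j\,ds$

for every $z$ and every real $\alpha$. Keeping $z$ fixed, consider the vector $k$ and the
sets $L_\alpha$ defined by

$k^i = \zeta^i\zeta_0^i/|\zeta_0^i|$ on $J_0^i$, $\qquad k^i = 0$ elsewhere,

$L_\alpha^i = \{t \mid t \in J_0^i \text{ and } |\zeta_0^i| + \alpha k^i < 0\}$.

Then on $J_0^i$

$|\zeta_0^i + \alpha\zeta^i| = \bigl|\,\zeta_0^i/|\zeta_0^i|\,\bigl(|\zeta_0^i| + \alpha\zeta^i\zeta_0^i/|\zeta_0^i|\bigr)\bigr| = \bigl|\,|\zeta_0^i| + \alpha k^i\,\bigr|$

and on $J'^i$

$|\zeta_0^i + \alpha\zeta^i| = |\alpha|\,|\zeta^i|$.

Hence, if $j_0$ and $l_\alpha$ are the characteristic vectors of $J_0$ and $L_\alpha$ respectively,

(5.8) $\int |\zeta_0 + \alpha\zeta|\,j\,ds = \int (|\zeta_0| + \alpha k)\,j_0\,ds + |\alpha| \int |\zeta|\,j'\,ds - 2\int (|\zeta_0| + \alpha k)\,l_\alpha\,ds$.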


Now in view of (3.1) and (3.6),

$zv = \int \zeta(s)h(s)\,ds$

and, since $\zeta h = \mu_0 k j_0$,

$\int k\,j_0\,ds = zv/\mu_0$.

Moreover,

$\int |\zeta_0|\,j_0\,ds = \int |\zeta_0|\,j\,ds = z_0u/\mu_0$.

Combining the last two equations with (5.8) and (5.7) and using the fact
that

$0 < -(|\zeta_0| + \alpha k) \le |\alpha k|$ on $L_\alpha$,

it follows that

$\pm zu'/\mu_0 \le \int |\zeta|\,j'\,ds + 2\int |k|\,l_\alpha\,ds$,

where the sign agrees with that of $\alpha$.

For every monotone vanishing sequence $\{\alpha_n\}$, the corresponding sequences
$\{L^i_{\alpha_n}\}$ are non-decreasing, and the infinite products $\prod_n L^i_{\alpha_n}$ contain only
points at which the $k^i$ are infinite. Hence the measure of each $L^i_\alpha$ vanishes
with $\alpha$. The proof is completed by taking the two limits $\alpha \to \pm 0$ in the
last inequality.


6. Higher order systems. The theory is easily extended to cases in which
the operator $\mathcal{L}$ is of higher order, by rewriting the given system in an equivalent
first order form. The case of a single higher order expression provides an
interesting illustration, in view of the special results obtained below.

Taking $n = 1$, let $\mathcal{L}$ be replaced by the mth order operator

(6.1) $\mathcal{L}_m y = a\left(\frac{d^m y}{dt^m} + \sum_{i=1}^{m} b_i\,\frac{d^{i-1}y}{dt^{i-1}} + c\right)$,

where $a$ is almost everywhere finite, and $1/a$, the $b_i$, and $c$ are summable
functions. $\|\mathcal{L}_m x\|$ is to be minimized over all functions $x$ such that

(a) $x$ and its first $m - 1$ derivatives are absolutely continuous, and assume
given values at the endpoints of $I$.

(b) $\mathcal{L}_m x$ is essentially bounded, and vanishes almost everywhere on $I - J_m$,
where $J_m$ is a given subset of positive measure.

On taking

$y^i = \frac{d^{i-1}y}{dt^{i-1}}, \qquad i = 1, \ldots, m,$

this reduces to a first order problem in which $n = m$, all the $J^i$ are empty
except $J^m = J_m$, and the matrix $A$ is diagonal, with unit diagonal elements
except $a^{mm} = a$. Accordingly, the function $\mu$ has the form

$\mu(z) = zu \Big/ \int |z^m/a|\,ds$,

where the integral extends over $J_m$.
















It is shown below that for every $z \ne 0$ the component $z^m$ has no more than
a finite number of zeros on $I$. Hence the set $V$ consists of the single element
$z = 0$, and the condition (4.5) is always satisfied. And if $|\mu|$ attains its supremum
at $z_0$, the set

$J_0^m = \{t \mid t \in J_m,\ z_0^m/a \ne 0\}$

differs from $J_m$ by a set of measure zero. Thus we have

THEOREM 2. For a single differential expression of the form (6.1) the problem
is always consistent and has the unique solution

$g_0 \sim \mu_0\,z_0^m a/|z_0^m a|$ on $J_m$, $\qquad g_0 \sim 0$ on $I - J_m$.

And except in the trivial case $u = 0$, the solution has a finite or infinite number of
sign changes according as the factor $a(t)$ changes sign a finite or infinite number of
times on $J_m$.
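Theorem 2 can be illustrated on the expression $\mathcal{L}_2 y = \ddot{y}$ (that is, $m = 2$, $a = 1$, $b_i = 0$, $c = 0$, $J_2 = I$). The Python sketch below is a hedged example, not the author's procedure: it searches a grid of directions in the two-dimensional space $Z$, whose elements are $z = (p,\ q - pt)$ for this operator, and returns the largest value of $|\mu(z)|$ found. All function names and the grid search itself are assumptions of this illustration.

```python
import math


def integral_abs_linear(p, q, a, b):
    """Exact integral of |q - p*s| over [a, b]."""
    def antiderivative(s):               # antiderivative of q - p*s
        return q * s - 0.5 * p * s * s
    if p != 0.0:
        root = q / p
        if a < root < b:                 # integrand changes sign once inside [a, b]
            return (abs(antiderivative(root) - antiderivative(a))
                    + abs(antiderivative(b) - antiderivative(root)))
    return abs(antiderivative(b) - antiderivative(a))


def minimax_acceleration(a, b, y_a, v_a, y_b, v_b, samples=20001):
    """Grid search for the supremum of |mu(z)|, z = (p, q - p*t), for L y = y''."""
    # u = w_b - w_a evaluated at t = b, using K(t, s) = [[1, t - s], [0, 1]].
    u1 = y_b - y_a - (b - a) * v_a
    u2 = v_b - v_a
    best_mu, best_p, best_q = 0.0, 1.0, 0.0
    for k in range(samples):             # half circle suffices: mu(-z) = -mu(z)
        theta = math.pi * k / samples
        p, q = math.cos(theta), math.sin(theta)
        denominator = integral_abs_linear(p, q, a, b)
        if denominator <= 0.0:
            continue
        mu = (p * u1 + (q - p * b) * u2) / denominator
        if abs(mu) > abs(best_mu):
            best_mu, best_p, best_q = mu, p, q
    return best_mu, best_p, best_q
```

For the rest-to-rest transfer `minimax_acceleration(0.0, 1.0, 0.0, 0.0, 1.0, 0.0)` the search returns a value of $\mu_0$ close to 4 with $q/p$ close to $1/2$; the solution $g_0 = \mu_0\,\mathrm{sign}(q - pt)$ is then the familiar acceleration of magnitude 4 that changes sign once, at the midpoint of the interval.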


Proof. Let $z$ be any element of $Z$ for which $z^m$ has an infinity of zeros on $I$.
These zeros must have a limit point $\tau \in I$. We will show that if, for any $j$
in the range $m \ge j > 1$, $\tau$ is a common limit point of the zeros of each component
$z^j, \ldots, z^m$, then $\tau$ is also a limit point of the zeros of $z^{j-1}$. Hence
the zeros of every component have $\tau$ as a common limit point. Since $z$ is continuous,
it follows that every component vanishes at $t = \tau$, and hence that $z$
vanishes identically.

Now the adjoint equations

$\dot{z}^i - b_i z^m + z^{i-1} \sim 0, \qquad j \le i \le m,$

may be regarded as equations for $(z^j, \ldots, z^m)$ with the inhomogeneous term
$z^{j-1}$. Since $z^j, \ldots, z^m$ all vanish at $t = \tau$, it follows from the analog
of (3.2) that

$z^j(t) = -\int_\tau^t k_j(t, s)\,z^{j-1}(s)\,ds$,

where $k_j$ is continuous in both its arguments, and $k_j(\tau, \tau) = 1$. Hence, for
sufficiently small $|t - \tau|$, the continuous function $z^{j-1}$ must change sign between
$\tau$ and each zero of $z^j$. This completes the proof.


REFERENCES

1. G. D. Birkhoff and R. E. Langer, The boundary problems and developments associated with a system of ordinary differential equations of the first order, Proc. Amer. Acad. Arts Sci., 58 (1923), 345-424.
2. E. J. McShane, Integration (Princeton, 1944).


Los Alamos Scientific Laboratory 










A MIXED PROBLEM FOR NORMAL HYPERBOLIC 
LINEAR PARTIAL DIFFERENTIAL EQUATIONS 
OF SECOND ORDER 


G. F. D. DUFF 


In the theory of hyperbolic differential equations a mixed boundary value 
problem involves two types of auxiliary conditions which may be described 
as initial and boundary conditions respectively. The problem of Cauchy, in 
which only initial conditions are present, has been studied in great detail, 
starting with the early work of Riemann and Volterra, and the well-known 
monograph of Hadamard (4). A modern treatment of great generality has been 
given by Leray (7). In contrast mixed problems have received comparatively 
little attention, and the nature of the boundary conditions to be imposed on 
equations of order higher than the second is known only for equations in two 
independent variables (8). For second order normal hyperbolic equations,
both linear and non-linear, the problem has been studied, using the method
of analytic approximation, by Schauder and Krzyzanski (5) who assigned 
as boundary condition that the unknown function should take given values 
on a timelike boundary surface. The monograph of Ladyshenskaya (6) 
treats certain cases of the problem where the normal derivative is given, for 
instance when the coefficients are independent of the time variable. 

In this paper a different boundary condition is considered; this condition 
involves the derivative of the dependent variable in a given direction, which 
is defined on the boundary but is not tangential to the boundary. There are 
restrictions in the large on this direction, made necessary by the properties of 
certain families of characteristic surfaces. However, the condition includes as a 
special case the problem of the normal derivative, which arises in the theory of 
supersonic flow. 

As in (5) the analytic case is treated first, by means of dominant power 
series. The nature of the boundary conditions is taken into account by a 
certain order of choice among the dominating series. For the non-analytic case 
a suitable modification of the estimates of (5) is arranged, while the construction 
of the solution is as before. 


1. The mixed problem. We study the linear normal hyperbolic partial
differential equation

(1.1) $L(u) = a^{ik}\frac{\partial^2 u}{\partial x^i\,\partial x^k} + b^i\frac{\partial u}{\partial x^i} + cu = f,$

Received February 27, 1956.
The author is indebted to Professor J. Leray for an interesting and valuable discussion of
this problem. He also acknowledges with thanks the helpful advice and criticism of Professor
A. Robinson.




with one dependent variable $u$ and $N$ independent variables $x^i$ ($i = 1, \ldots, N$).
Summation over repeated indices is understood in (1.1). The coefficients
$a^{ik}$, $b^i$, $c$, and $f$ are functions of the $x^i$, differentiable $k$ times throughout the
domain of $x^i$ space to be considered. The normal hyperbolic character of (1.1)
is expressed by the signature of the quadratic form

(1.2) $a^{ik}\xi_i\xi_k$,

which signature is $(N - 1, 1)$ with one negative term.

With the Riemann metric

(1.3) $ds^2 = a_{ik}\,dx^i\,dx^k$

based on the associate covariant tensor $a_{ik}$, we have a classification of directions
$v^i$ as spacelike, null, or timelike, according as

$v^2 = a_{ik}v^iv^k = a^{ik}v_iv_k$

is positive, zero or negative. Also surfaces $S$: $\phi(x^i) = 0$ shall be spacelike,
null or timelike according as

(1.4) $a^{ik}\frac{\partial\phi}{\partial x^i}\frac{\partial\phi}{\partial x^k}$

is negative, zero, or positive. The normal $n^i$ is defined by

(1.5) $n^i = a^{ik}\frac{\partial\phi}{\partial x^k}$.
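The classification of directions can be made concrete with a small Python sketch. It is only an illustration: the function names and the tolerance are assumptions, and the metric is passed as a nested list holding the covariant components $a_{ik}$ at one point.

```python
def quadratic_form(a, v):
    """Value of a_{ik} v^i v^k for a symmetric matrix a given as nested lists."""
    n = len(v)
    return sum(a[i][k] * v[i] * v[k] for i in range(n) for k in range(n))


def classify_direction(a, v, tol=1e-12):
    """Spacelike, null or timelike according as v^2 is positive, zero or negative."""
    value = quadratic_form(a, v)
    if value > tol:
        return "spacelike"
    if value < -tol:
        return "timelike"
    return "null"
```

With the constant metric `[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]`, which has signature $(N - 1, 1)$ for $N = 4$, the direction `[1, 0, 0, 0]` is spacelike, `[0, 0, 0, 1]` is timelike, and `[1, 0, 0, 1]` is null.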
Let $S$: $\phi(x^i) = 0$ be an initial spacelike surface and let $T$: $\psi(x^i) = 0$ be a
timelike surface intersecting $S$ in a rim $C$ of $N - 2$ dimensions. We shall
suppose that $S$ is bounded by $C$ and that $T$ is bounded "toward the past" by
$C$, in a suitable orientation of "time." Let $G$ be the characteristic surface,
passing through the rim $C$, which lies in the region enclosed by $S$ and $T$, that
is, which bounds the domain of dependence $D_S$ on $S$ according to the theory of
the Cauchy problem. We note that $G$ is composed of characteristic strips
tangent to the rim $C$.

On $S$ we assign values of $u$ and $\partial u/\partial n$; these are the usual Cauchy data,
and they determine $u$ in $D_S$. On $T$ we assign a boundary condition "of the
second kind" as follows. Let $v$ be a vector field defined on $T$ and subject to
restrictions stated below. Then, if the directional derivative of $u$ in the direction
of $v$ is denoted by $\partial u/\partial v$, we set

(1.6) $\frac{\partial u}{\partial v} = v^i\frac{\partial u}{\partial x^i} = f(x^i)$,

where $f(x^i)$ is a datum function given on $T$.

On the rim C this datum function $f(x^i)$ and its derivatives transverse to
C in T shall be subject to certain conditions of compatibility with respect
to the given differential equation and Cauchy data. These conditions are
imposed to ensure that the derivatives of u up to a certain order shall be
continuous across G. The value for u on C shall be that assigned by the Cauchy
data, and the first compatibility condition is that f shall be equal to the value
of $\partial u/\partial v$ on C derived from the Cauchy data. The second compatibility
condition brings in the differential equation since it postulates that $\partial f/\partial t$ equal
$\partial^2 u/\partial v\,\partial t$, the latter being calculated on C from the Cauchy data and the
differential equation. Here t denotes a suitable timelike variable. Likewise
the kth condition of compatibility determines the $(k - 1)$st derivative of f
as equal to the corresponding derivative of u, calculated from the Cauchy
data and the differential equation, by successive differentiation with respect
to t and substitution of values already found.

If the first k compatibility conditions hold, it is evident that u and its first k
derivatives are continuous across G at the rim C. Now the method of
construction of the solutions leads to their being continuous across G, at every
point of G. To show that the transverse derivatives of u up to order k are
also continuous across G, we note that through each point of G there passes
a bicharacteristic ray issuing from the rim C. If a solution u is continuous
and has continuous tangential derivatives on G, then its first transverse
derivative is continuous along the entire bicharacteristic if it is so at one
point. The conditions necessary for this conclusion will be satisfied in our
construction, and we infer that the first transverse derivative is continuous
across G. In succession the higher transverse derivatives, up to and including
order k, are proved continuous across G.

The initial and boundary conditions determine u in a larger region. Construct
the retrograde characteristic cone C” with vertex P. If C”, T, and S
together bound a region, then P lies in this domain of dependence on S and T.
Since the Cauchy initial value problem can be regarded as solved, we subtract
its solution from the dependent variable and so find a reduced boundary
value problem which may be stated as follows. We must find a solution of
(1.1), with f = 0, which vanishes on $G_0$, satisfies (1.6) on T, and is defined in
the region V intermediate to S and T. The compatibility condition is now
that $f(x^i) = 0$ on C, while the corresponding condition of order k is that
the derivative of $f(x^i)$ of order $k - 1$ in a direction within T but transverse
to C should vanish. We note that the rim C, being a subspace of S, is spacelike,
and that all directions tangent to G are either spacelike or null. Indeed G,
being an integral surface of a first order partial differential equation, is
composed of the bicharacteristic curves passing through C which determine
characteristic strips tangent to C.

Now let $C_t$ be a family of spacelike $(N-2)$-dimensional surfaces filling T,
and such that $C_0 = C$. We may construct characteristic surfaces $G_t$ containing
$C_t$ and these will fill up the region V. The condition which we impose on the
vector field v is that v should not be tangent to the local $G_t$ at any point of T.

That some restriction of the vector field relative to characteristic surfaces
is necessary can be seen from the equation of the vibrating string,

    $u_{xx} = u_{tt}$.

If we require $u(x, 0) = u_t(x, 0) = 0$ for t = 0, x > 0, and

    $\alpha(t)\,u_x + \beta(t)\,u_t = f(t)$,    x = 0, t > 0,

then

    $u(x,t) = \displaystyle\int_0^{t-x} \frac{f(\tau)}{\beta(\tau) - \alpha(\tau)}\,d\tau$,    t > x,

and the denominator of the integrand vanishes if $\alpha(\tau) = \beta(\tau)$; that is, if the
vector field takes the direction of the forward characteristic entering the
region through T. This condition holds for all such equations in two variables:
it is easily seen that the directional derivative along any forward characteristic
is determined by the Cauchy problem with data taken at the instant t where
the characteristic curve meets T.
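The string example can be checked numerically. The sketch below is only an illustration under assumed data that do not appear in the paper: constant $\alpha = 1$, $\beta = 3$ and $f(t) = 2t$. The helper names `u`, `wave_residual` and `boundary_residual` are likewise hypothetical. It evaluates the integral formula by a midpoint rule and tests, by finite differences, that the result satisfies $u_{xx} = u_{tt}$ for t > x and the oblique condition on x = 0.

```python
# Sketch under assumed data (not from the paper): alpha = 1, beta = 3, f(t) = 2t.

def alpha(t):
    return 1.0


def beta(t):
    return 3.0


def f(t):
    return 2.0 * t


def u(x, t, steps=200):
    """Midpoint-rule value of the integral of f/(beta - alpha) over [0, t - x]."""
    s = t - x
    if s <= 0.0:
        return 0.0  # ahead of the characteristic t = x the solution vanishes
    h = s / steps
    total = 0.0
    for j in range(steps):
        tau = (j + 0.5) * h
        total += f(tau) / (beta(tau) - alpha(tau))
    return total * h


def wave_residual(x, t, h=1e-3):
    """Central-difference estimate of u_xx - u_tt at (x, t)."""
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h**2
    return u_xx - u_tt


def boundary_residual(t, h=1e-3):
    """Central-difference estimate of alpha*u_x + beta*u_t - f on x = 0."""
    u_x = (u(h, t) - u(-h, t)) / (2.0 * h)
    u_t = (u(0.0, t + h) - u(0.0, t - h)) / (2.0 * h)
    return alpha(t) * u_x + beta(t) * u_t - f(t)
```

If `alpha` and `beta` were made equal, the quotient inside `u` would raise a division-by-zero error, which is the numerical counterpart of the excluded characteristic direction.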

If N > 2, the situation is more involved. For the analogous condition,
namely that v should not touch $G_t$, we may ask: what conditions on v enable
us to construct a family of spacelike varieties $C_t$ on T such that v will never
be tangent to the $G_t$ constructed on $C_t$ as base? Such a field will be called
admissible.

To answer the question we recall that the $G_t$, being integral surfaces of a
first order partial differential equation, are constructed as envelopes of portions
of the characteristic conoids with vertex on $C_t$. The portion concerned is that
part C” of the forward half-cone lying in the interior of our region.

Let us consider the tangent space at a typical point P of T: $\psi = x = x^{N-1} = 0$.
In the surface T the tangent plane to $C_t$ will take the form (with P as
origin)

    $t - \sum_a c_a x_a = 0$,    $a = 1, \ldots, N - 2$,

where $\sum c_a^2 < 1$ since $C_t$ is spacelike. The tangent plane to the cone C”, in
the full space, which also meets T tangent to $C_t$, will have the equation

    $t - \sum_a c_a x_a - \sqrt{1 - \sum_a c_a^2}\;x = 0$.

We must determine those regions of space such that there exist values of the
$c_a$ which render the function

    $f = t - \sum_a c_a x_a - \sqrt{1 - \sum_a c_a^2}\;x$

consistently of a given sign.

There will later appear the restriction that the vector v should not touch
T, and we shall, for convenience, make this assumption here. The region of
space to be considered may now be taken as the side x > 0 of T. Two cases
arise, according as the initial value of f, with the components of v substituted
for the coordinates, is positive or negative. These correspond to v lying
initially “later” than G or “earlier,” and will be referred to as the positive and
negative cases respectively.




Taking first the case of negative values, we shall minimize f with respect
to the $c_a$. Since

    $\dfrac{\partial f}{\partial c_a} = -x_a + \dfrac{c_a\,x}{\sqrt{1 - \sum c_d^2}}$

and

    $\dfrac{\partial^2 f}{\partial c_a\,\partial c_b} = \dfrac{\delta_{ab}\,x}{\sqrt{1 - \sum c_d^2}} + \dfrac{c_a c_b\,x}{(1 - \sum c_d^2)^{3/2}}$,

we find, first, that an extremum is present for

    $c_a = \dfrac{x_a}{\sqrt{x^2 + \sum x_b^2}}$,

and secondly, that this is a minimum value. The actual minimum is therefore

    $f_{\min} = t - \sqrt{x^2 + \sum x_a^2}$,

which will be negative if

    $t < \sqrt{x^2 + \sum x_a^2}$.

That is, points on the forward cone C”, or within it, are excluded.
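The minimisation can be spot-checked numerically. The following sketch assumes the local form $f = t - \sum c_a x_a - \sqrt{1 - \sum c_a^2}\,x$ used above; the function names and the sample point are illustrative choices, not taken from the paper. It compares the value of f at the stated extremum with the closed-form minimum and with random admissible values of the $c_a$.

```python
import math
import random


def f(c, t, x, xs):
    """f = t - sum(c_a x_a) - sqrt(1 - sum(c_a^2)) * x, for sum(c_a^2) < 1."""
    s = sum(ci * ci for ci in c)
    return t - sum(ci * xi for ci, xi in zip(c, xs)) - math.sqrt(1.0 - s) * x


def minimiser(x, xs):
    """Stationary point c_a = x_a / sqrt(x^2 + sum x_b^2)."""
    r = math.sqrt(x * x + sum(xi * xi for xi in xs))
    return [xi / r for xi in xs]


def f_min(t, x, xs):
    """Closed-form minimum t - sqrt(x^2 + sum x_a^2)."""
    return t - math.sqrt(x * x + sum(xi * xi for xi in xs))


def sample_unit_ball(dim, rng):
    """Rejection-sample a point with sum(c_a^2) < 1 (hypothetical helper)."""
    while True:
        c = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if sum(ci * ci for ci in c) < 1.0:
            return c
```

Because the matrix of second derivatives is positive definite for x > 0, the stationary point is a global minimum on the open unit ball, so no sampled value should fall below `f_min`.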
The positive case is a little different, since no true maximum of f exists.
As we have x > 0 the third term of f is negative, and it follows that if

    $f_1 = t - \sum c_a x_a$

takes positive values for some $c_a$, then so does f for sufficiently small positive
x. Now the sum in f will take its greatest value when $\sum c_a^2$ is allowed its
greatest value. Thus we may take $\sum c_a^2 = 1$ and so find the extrema of

    $f_2 = t - \sum c_a x_a + \lambda\left(1 - \sum c_a^2\right)$.

Hence

    $\dfrac{\partial f_2}{\partial c_a} = -x_a - 2\lambda c_a = 0$,

and so, with $\sum c_a^2 = \sum x_a^2/4\lambda^2 = 1$, we find

    $c_a = \dfrac{-x_a}{\sqrt{\sum x_b^2}}$

and the maximum of $f_1$ is

    $f_{1\,\max} = t - \sum c_a x_a = t + \sqrt{\sum x_a^2}$.

Thus we get positive values for $f_1$, and since $\sum c_a^2 = 1$, also for f, provided

    $t > -\sqrt{\sum x_a^2}$.

The bounding surface so defined is cylindrical in the x direction, and touches
the cone C” along its intersection with the surface T.


If the condition of not touching holds at a point P, then by continuity it
holds in a neighbourhood of P. Over a compact portion of T we can find
uniform moduli of continuity, provided that the limiting cases mentioned
above do not arise. A neighbourhood of uniform size can thus be defined,
and the construction extended to the whole of the compact region by repeated
application of the existence theorem.

This result may be summarized as follows:

LEMMA I. A vector field v not tangent to $G_0$ or to T is admissible if
(a) being initially positive, it satisfies

    $v_t > -\sqrt{\sum v_a^2}$    on T;

(b) being initially negative, it satisfies

    $v_t < \sqrt{v_x^2 + \sum v_a^2}$    on T.

We remark that the normal vector field, with one non-vanishing component
$v_x > 0$, falls under case (b).


2. The analytic case. Let all coefficients in the differential equation,
and the surfaces S and T, be analytic. Then characteristic surfaces such as
the $G_t$ are also analytic provided that the rims $C_t$ are analytic. This can be
arranged and will be assumed.

Before reducing the differential equation to a standard form (4, p. 76) we
shall simplify the boundary condition

(2.1)    $\dfrac{\partial u}{\partial v} = f$.

Here f vanishes to order k + 1 on $C_0$ according to the compatibility conditions.
We note that the vector field v is not parallel to $G_0$ on T and thus we can
construct a $C^k$ function $u_1$ which vanishes on $G_0$ and also satisfies (2.1) on T.
Subtracting this function from u, we obtain for the new dependent variable a
differential equation of the form

(2.2)    $L(u) = f_1$,

while the new boundary conditions are (cf. §6),

(2.3)    $u = 0$    on $G_0$,

with

(2.4)    $\dfrac{\partial u}{\partial v} = 0$    on T.

We now change the independent variables so as to give T the equation
$x = x^{N-1} = 0$ while the analytic family of characteristic surfaces $G_t$ have
equation

    $G_t$: $t = x^N = \text{const.}$

This forces the coefficient $a^{NN}$ to vanish identically in the new system. Since
the rim $C_t$ is spacelike, and so never tangent to a bicharacteristic direction,





we can choose the remaining variables $x^1, \ldots, x^{N-2}$ so that the
bicharacteristics on $G_t$ are

(2.5)    $x^\rho = \text{const.}$,    $\rho = 1, \ldots, N - 2$.

This results in the vanishing of the coefficients $a^{\rho N}$. Following Hadamard
(4), we divide by $a^{N-1,N}$ which cannot now vanish since L(u) is not parabolic;


and we replace u by

    $u \exp\left[\displaystyle\int b^N\,dx\right]$,


which causes the term in L(u) containing $\partial u/\partial t$ to disappear. Then the
differential equation becomes

(2.6)    $\dfrac{\partial^2 u}{\partial x\,\partial t} = L_1(u) + f_2$,

where the operator $L_1(u)$ contains no differentiations with respect to t. With
this form of the equation Hadamard and others studied the indeterminacy
of Cauchy's problem for characteristic surfaces.

The boundary conditions to go with (2.6) are now

(2.7)    $u = 0$    for t = 0

and

(2.8)    $\dfrac{\partial u}{\partial v} = v^i\,\dfrac{\partial u}{\partial x^i} = bu$,    x = 0.

In order to express this latter condition more conveniently, we note that by
hypothesis the component $v^N$ does not vanish; this is our condition on the
vector v. Dividing by $v^N$ and transposing some terms, we have

(2.9)    $\dfrac{\partial u}{\partial t} = \displaystyle\sum_{k=1}^{N-1} \beta^k\,\dfrac{\partial u}{\partial x^k} + hu$,    x = 0.

We now expand u in a series of powers of t, and determine the coefficients
in succession. Let

(2.10)    $u = \displaystyle\sum_{n=1}^{\infty} u_n t^n$,    $f_2 = \displaystyle\sum_{n=0}^{\infty} f_{2n} t^n$,

and also let

    $L_1(u) = \displaystyle\sum_r t^r L_{r1}(u)$.

Then the $u_n$ satisfy

(2.11)    $n\,\dfrac{\partial u_n}{\partial x} = L_{01}(u_{n-1}) + f_{2,n-1} + \ldots$,

where the terms omitted contain the $u_k$ ($k = 0, 1, \ldots, n - 2$). We have taken
$u_0 = 0$ to satisfy (2.7). Substituting these expansions into (2.8), we get the
conditions













(2.12)    $n\,u_n = \displaystyle\sum_k \beta_0^k\,\dfrac{\partial u_{n-1}}{\partial x^k} + h_0 u_{n-1} + \ldots$,    x = 0.

Here $\beta_0^k$ and $h_0$ are initial terms in the expansions of the $\beta^k$ and h in powers of
t, while the terms omitted in (2.12) again contain $u_k$ ($k = 0, 1, \ldots, n - 2$).
Thus the $u_n$ are uniquely determined by integration of (2.11) for successive
values of n, in the form

(2.13)    $n\,u_n(x) = n\,u_n(0) + \displaystyle\int_0^x \left[L_{01}(u_{n-1}) + f_{2,n-1} + \ldots\right] dx'$,

and the functions so found are analytic in x as well as in the remaining variables.

The techniques of dominating series will now be applied to show that the 
series solution thus found is convergent in a certain domain. We note that 
the operations in (2.13) are such as to preserve any dominant relation; thus 
if we dominate the coefficients in (2.6) and (2.9) the new solution will dominate 
that already found. Now the two auxiliary conditions will be dominated in 
the following way. We shall seek a solution with positive coefficients of the 
dominating differential equation. This solution will automatically dominate 
the condition (2.7). We will also show that if the left side of (2.9) is computed 
(in the dominant case) it will dominate the right side, and therefore will
dominate the actual condition (2.9). This requires a certain order of choice 
among the various dominating constants which will appear. The proof will 
also show that the series has a radius of convergence independent of the 
function $f_2$ in (2.6), and hence independent of the data prescribed for the
original problem. 

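We choose as origin a point of $C_0$ and set

    $y = x^1 + \ldots + x^{N-2}$,

and let $\rho$, $\sigma$ be sufficiently small positive numbers. Then the dominant boundary
condition can be written

(2.14)    $\dfrac{\partial u}{\partial t} = \left(1 - \dfrac{t}{\sigma}\right)^{-1}\left(1 - \dfrac{y}{\rho}\right)^{-1}\left[\displaystyle\sum_i G_i\,\dfrac{\partial u}{\partial x^i} + Hu\right]$,

where $G_i$ ($i = 1, \ldots, N - 1$) and H are positive constants. Letting

(2.15)    $\tau = -\sigma \log\left(1 - \dfrac{t}{\sigma}\right) = t + \dfrac{t^2}{2\sigma} + \ldots$,

we can write this

(2.16)    $\dfrac{\partial u}{\partial \tau} = \left(1 - \dfrac{y}{\rho}\right)^{-1}\left[\displaystyle\sum_i G_i\,\dfrac{\partial u}{\partial x^i} + Hu\right]$,    x = 0.

Denote $\sum_i G_i$ by G.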
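The passage from (2.14) to (2.16) is a one-line change of variable. The worked step below is offered as a reading aid and uses only the symbols already defined in (2.14) and (2.15).

```latex
\[
\tau = -\sigma\log\Bigl(1-\frac{t}{\sigma}\Bigr)
\quad\Longrightarrow\quad
\frac{d\tau}{dt} = \Bigl(1-\frac{t}{\sigma}\Bigr)^{-1},
\qquad
\frac{\partial u}{\partial t}
  = \frac{d\tau}{dt}\,\frac{\partial u}{\partial\tau}
  = \Bigl(1-\frac{t}{\sigma}\Bigr)^{-1}\frac{\partial u}{\partial\tau}.
\]
\[
\tau = \sum_{m\ge 1}\frac{t^{m}}{m\,\sigma^{m-1}}
     = t + \frac{t^{2}}{2\sigma} + \frac{t^{3}}{3\sigma^{2}} + \ldots
\]
```

The common factor $(1 - t/\sigma)^{-1}$ therefore cancels from both sides of (2.14), and every coefficient in the expansion of $\tau$ in powers of t is positive.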
In proving the convergence theorem we will actually assume that the left
side of (2.16) dominates the right side. Since the series in (2.15) has positive
coefficients, this will imply that the left side of (2.14) dominates the right
side, and hence that the boundary condition (2.9) will be dominated as
required.
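The dominating differential equation takes the form

(2.17)    $\dfrac{\partial^2 U}{\partial x\,\partial t} = \left(1 - \dfrac{t}{\sigma}\right)^{-1}\left(1 - \dfrac{x + y}{\rho}\right)^{-1}\left[\displaystyle\sum_{i,j} A_{ij}\,\dfrac{\partial^2 U}{\partial x^i\,\partial x^j} + \sum_i B_i\,\dfrac{\partial U}{\partial x^i} + CU + F\right]$,

where the $A_{ij}$, $B_i$, C and F are constants. Here only F depends on $f_2$ in (2.6)
and we have therefore to find a radius of convergence independent of F.
Let us assume that U is a function of $\tau$, x, and y only. Then (2.17) becomes,
with use of (2.15),

(2.18)    $\dfrac{\partial^2 U}{\partial x\,\partial\tau} = \left(1 - \dfrac{x + y}{\rho}\right)^{-1}\left[A_1\,\dfrac{\partial^2 U}{\partial x^2} + A_{12}\,\dfrac{\partial^2 U}{\partial x\,\partial y} + A_2\,\dfrac{\partial^2 U}{\partial y^2} + B_1\,\dfrac{\partial U}{\partial x} + B_2\,\dfrac{\partial U}{\partial y} + CU + F\right]$,

where the A's are chosen anew if necessary. Since we can always increase
them, we shall require that

(2.19)    $A = A_1 + A_{12} + A_2 > G$.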


If we further assume that U is a function of the combination

(2.20)    $w = \tau + \alpha(x + y)$

alone, where $\alpha > 0$, then we can find an ordinary differential equation for
U(w) which dominates (2.18) and therefore still more dominates (2.6). To
do this we replace x + y in the denominator of (2.18) by $x + y + \tau/\alpha = w/\alpha$.
Collecting terms in the ordinary differential equation to which (2.18) now leads,
we find

(2.21)    $\left(1 - \dfrac{w}{\alpha\rho} - \alpha A\right) U'' = BU' + \dfrac{C}{\alpha}\,U + \dfrac{F}{\alpha}$.

Here primes denote differentiation with respect to w. We now choose $\alpha$ so
small that

(2.22)    $1 - \alpha A > 0$.

According to (2.18) we would set B in (2.21) equal to $B_1 + B_2$. Since we
can increase B freely without destroying the dominance over (2.6), we shall
stipulate that

(2.23)    $B > H\,\dfrac{1 - \alpha A}{1 - \alpha G} + \dfrac{1}{\alpha\rho}$.

Defining $r = \alpha\rho(1 - \alpha A)$ we can write (2.21) in the form

(2.24)    $\left(1 - \dfrac{w}{r}\right) U'' = \dfrac{B}{1 - \alpha A}\,U' + \dfrac{C}{\alpha(1 - \alpha A)}\,U + \dfrac{F}{\alpha(1 - \alpha A)}$.












If in this equation $U(0)$ and $U'(0)$ are positive, then all coefficients of the
series solution are positive.

Indeed, if we set

(2.25)    $U = \displaystyle\sum_n a_n w^n$,

the recursion formula for the $a_n$ is seen to be

    $(n + 2)(n + 1)\,a_{n+2} = (n + 1)\left[\dfrac{n}{r} + \dfrac{B}{1 - \alpha A}\right] a_{n+1} + \dfrac{C}{\alpha(1 - \alpha A)}\,a_n + \dfrac{F}{\alpha(1 - \alpha A)}$,

where the last term on the right is present only if n = 0. Assuming now that
$a_0$ and $a_1$ are non-negative, we find

(2.26)    $(n + 2)\,a_{n+2} \geq \left[\dfrac{n}{r} + \dfrac{B}{1 - \alpha A}\right] a_{n+1}$.

This relation is used below.

Now consider the boundary condition (2.16). The formulae (2.9) and (2.12)
show that the initial values for the $u_n$ will be dominated if the left side of
(2.16) dominates the right side. Still more will this hold if y in (2.16) is replaced
by $x + y + \tau/\alpha = w/\alpha$. With this modification we get for U(w) the condition

(2.27)    $U'(w) \gg \left(1 - \dfrac{w}{\alpha\rho}\right)^{-1}\left[\alpha G\,U' + HU\right]$,

which will hold if

(2.28)    $\left(1 - \alpha G - \dfrac{w}{\alpha\rho}\right) U'(w) \gg H\,U(w)$.

To verify that (2.28) implies (2.27) we recall that U is a series with positive
coefficients; thus if we add to each side the series $\alpha G\,U'$ and then multiply
on right and left by the series for $(1 - w/\alpha\rho)^{-1}$ we will not destroy the
dominating relation.


To demonstrate (2.28) we calculate the coefficient of $w^n$ on the left; it is

    $(1 - \alpha G)(n + 1)\,a_{n+1} - \dfrac{1}{\alpha\rho}\,n\,a_n$,

which by (2.26), with n + 1 changed into n, is not less than

    $(1 - \alpha G)\left[\dfrac{n - 1}{r} + \dfrac{B}{1 - \alpha A}\right] a_n - \dfrac{1}{\alpha\rho}\,n\,a_n$.

From (2.23) we find that this in turn exceeds

    $\left[\dfrac{(1 - \alpha G)(n - 1)}{\alpha\rho(1 - \alpha A)} + H + \dfrac{1 - \alpha G}{\alpha\rho(1 - \alpha A)} - \dfrac{n}{\alpha\rho}\right] a_n = \left[H + \dfrac{n\,\alpha(A - G)}{\alpha\rho(1 - \alpha A)}\right] a_n$.

Since G < A the last term in the bracket is positive and thus the coefficient
exceeds $H a_n$ which is the coefficient of $w^n$ on the right of (2.28). This
proves that (2.28) and (2.27) hold in general and hence that (2.16) and so
(2.9) are dominated in the required way when x = 0.

For dominating power series we therefore choose a solution U(w) having
positive values for $U(0)$ and $U'(0)$. The radius of convergence of this series
is equal to $r = \alpha\rho(1 - \alpha A)$, from the theory of linear differential equations,
and this is independent of F.
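The recursion behind (2.25) and the two coefficient inequalities can be explored numerically. The sketch below uses hypothetical constants chosen only so that (2.19), (2.22) and (2.23) hold ($\alpha = 0.1$, $\rho = 1$, $A = 2$, $G = 1$, $H = 1$, $B = 12$, $C = 1$, $F = 5$); none of these values comes from the paper, and the function names are illustrative. For n = 0 the coefficient form of (2.28) reads $(1 - \alpha G)\,a_1 \geq H a_0$, so the sketch picks $a_1$ large enough relative to $a_0$ for that case as well.

```python
def series_coefficients(alpha, rho, A, B, C, F, a0, a1, n_terms):
    """Coefficients a_n of U(w) = sum a_n w^n from the recursion after (2.25)."""
    one_minus = 1.0 - alpha * A          # the quantity assumed positive in (2.22)
    r = alpha * rho * one_minus          # r = alpha * rho * (1 - alpha * A)
    scale = alpha * one_minus
    a = [a0, a1]
    for n in range(n_terms - 2):
        rhs = (n + 1) * (n / r + B / one_minus) * a[n + 1] + (C / scale) * a[n]
        if n == 0:
            rhs += F / scale             # inhomogeneous term, present only for n = 0
        a.append(rhs / ((n + 2) * (n + 1)))
    return a, r


def holds_2_26(a, r, alpha, A, B):
    """(n+2) a_{n+2} >= [n/r + B/(1 - alpha A)] a_{n+1}, up to rounding."""
    bound = B / (1.0 - alpha * A)
    return all((n + 2) * a[n + 2] >= (n / r + bound) * a[n + 1] * (1.0 - 1e-12)
               for n in range(len(a) - 2))


def holds_2_28(a, alpha, rho, G, H):
    """Coefficient of w^n on the left of (2.28) is at least H a_n."""
    return all((1.0 - alpha * G) * (n + 1) * a[n + 1] - n * a[n] / (alpha * rho)
               >= H * a[n]
               for n in range(len(a) - 1))


# Hypothetical sample constants (assumptions for illustration only).
ALPHA, RHO, A, G, H, B, C, F = 0.1, 1.0, 2.0, 1.0, 1.0, 12.0, 1.0, 5.0
coeffs, radius = series_coefficients(ALPHA, RHO, A, B, C, F, a0=1.0, a1=2.0, n_terms=120)
```

The ratio of successive coefficients approaches $1/r$ only slowly, which is consistent with a radius of convergence equal to r; changing `F` rescales part of the sequence but leaves `radius` unchanged.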

Repeating this work at other points of $C_0$ we can show that the unique
analytic solution thus found exists in a neighbourhood $t < \delta_1$ of $C_0$. Here
$\delta_1$ is independent of the datum functions of the Cauchy problem as well as
the mixed boundary condition. If we select any compact portion of T such that
the above hypotheses are uniformly satisfied when any one of the characteristic
surfaces $G_t$ is chosen in place of $G_0$ then we can find a $\delta_1$ which will serve for
them all.

Combining the local solution just constructed with the solution of Cauchy's
problem for the analytic case, we see that the resulting composite solution is
analytic except possibly on $G_0$. If the datum function $f(x^i)$ originally given
satisfies the compatibility conditions of §1 up to the order k inclusive, then,
by well-known properties of the discontinuities across characteristic surfaces
of derivatives of u, it follows that u and its derivatives up to order k inclusive
are continuous across $G_0$. We state this result as a lemma.

LEMMA II. Let the compatibility conditions up to order k inclusive be satisfied
in the analytic case; then there exists a unique solution analytic for $0 < t < \delta_1$
except on $G_0$, where the derivatives of order up to k inclusive are continuous.

The domain of definition of this local solution will be extended in §4. We
note that for the purposes of this local analytic solution it is sufficient to have
(2.9) and thus v may be tangent to T.


3. Estimates of solutions. To extend the result to non-analytic equations
and data, we give estimates of the square integrals of the solution and its
derivatives up to a certain order. These are found by a modification of the
method used by Krzyzanski and Schauder (5), which in turn is based on the
work of Friedrichs and Lewy (3). For brevity we shall indicate only the
alterations necessary for our purposes. In this section we take the geometric
background to be Euclidean. It is also convenient to suppose that T is
cylindrical in the sense that S spans T and the rim $C_0 = S \cap T$ is closed.

Since the Cauchy problem is regarded as solved, we can take as initial
spacelike surface any spacelike surface which spans T, meeting T in the rim
$C_0$. We shall construct a family of surfaces $S_t$ spanning T, with $S_0 \cap T = C_0$,
and such that the given vector v is never tangent to the $S_t$. The direction
field v is again assumed admissible; thus there exists a family $G_t$ of characteristic
surfaces, with $G_0 \cap T = C_0$, such that v is not tangent to the $G_t$. Let us extend












v to a field defined throughout a region of space containing the $G_t$; if this
region is sufficiently small we can, even in the analytic case, determine v
analytically so that it is never tangent to $G_t$ in the region. Denoting the
minimum angle of v to $G_t$ by $\theta_0$, we construct spacelike surfaces $S_t$ as follows.
$S_t$ shall contain $C_t = T \cap G_t$; and $S_t$ shall be inclined to $G_t$ at an angle
between $6, and 34) at every point. These surfaces S, may be chosen to be 
analytic in the analytic case. Now we see that v is never tangent to the $S_t$.
We now set up a coordinate system on the family of surfaces $S_t$ with equation
t = const. We choose the coordinate network $x^1, \ldots, x^{N-1}$ on $S_t$ in such a way
that the parametric lines of t cross T at every point from inside to outside
with increasing t. This can be achieved by a change of scale in a suitable
“radial” coordinate in $S_t$, and does not alter the spacelike character of $S_t$.
Since v is never tangent to $S_t$, we could take as Nth coordinate, in place of t,
a suitable parameter $\xi^N$ along the integral curves of v. The transformation
of coordinates so defined is clearly non-singular; and will be used below in
certain surface integrals taken over T. By measuring arc along the v curves
starting on T we ensure that T has in these coordinates the equation

    $\xi^N = 0$.

However, this requires that v should not be tangent to T, which we therefore
assume for the rest of this section. This condition has been anticipated in the
form of the statement of Lemma I.

Let $V_t$ be the region bounded by S, T, and $S_t$; and let $n_1$, $n_2$, $n_3$ denote the
outward Euclidean normals on the surface of S, T, and $S_t$ respectively. If
$\cos(n x^i)$ denotes the cosine of n with the parametric line of $x^i$, then we
have

(3.1)    $\cos(n_1 t) < 0$,    $\cos(n_2 t) > 0$,    $\cos(n_3 t) > 0$.

We now multiply the differential equation (1.1) by $\partial u/\partial t$ and integrate
over $V_t$. After some partial integrations we find

F : Ou du 
‘ ‘ ‘ ik 
(3.2) | 2>> a* —| = cos(nx") 
7 S+T+S8 , at 


+ Sy kel Ox 
~ OU OU : 
— 2 a’ —z ~;cos(nt) |dS 
k= I Ox Ox 


where $\Phi$ is a quadratic expression in u and its first derivatives, involving also
the coefficients in (1.1) and their first derivatives.

A separate choice of variables is now made in each of the three surface
integrals on the left in (3.2). The coordinates $x^1, \ldots, x^{N-1}$ are not changed
but the last coordinate is taken to be

(3.3)    $\xi^N = g_s$    (s = 1, 2, 3),

where the $g_s$ are the functions giving the equations of S, T and $S_{t_0}$ when
equated to zero. Thus $g_1 = t$ and $g_3 = t - t_0$, while $g_2$ vanishes on T.
After some calculation, which we omit, it is possible to verify as in (5) that
the integrand on the left in (3.2) becomes

    $\left[a(g_s)\left(\dfrac{\partial u}{\partial \xi^N}\right)^2 - \displaystyle\sum_{i,k=1}^{N-1} a^{ik}\,\dfrac{\partial u}{\partial x^i}\,\dfrac{\partial u}{\partial x^k}\right]\cos(n_s t)$,

where

    $a(g_s) = \displaystyle\sum_{i,k=1}^{N} a^{ik}\,\dfrac{\partial g_s}{\partial x^i}\,\dfrac{\partial g_s}{\partial x^k}$

is the characteristic quadratic form. Since S and $S_{t_0}$ are spacelike and T
timelike, we see that

    $a(g_1) < 0$,    $a(g_2) > 0$,    $a(g_3) < 0$,

in view of the convention of sign in (1.2). Noting that the quadratic form

    $\displaystyle\sum_{i,k=1}^{N-1} a^{ik}\,\xi_i\,\xi_k$

is positive definite, and $\cos(n_3 t)$ is positive, we see that for s = 3, that is for
the integral over $S_{t_0}$, the integrand is negative definite. We shall now drop the
subscript 0 on the t in $S_{t_0}$.

Since $\cos(n_2 t)$ is positive, the term

    $-\displaystyle\sum_{i,k=1}^{N-1} a^{ik}\,\dfrac{\partial u}{\partial x^i}\,\dfrac{\partial u}{\partial x^k}\cos(n_2 t)$

is also negative definite. We transpose to the right side of (3.2) the other
term in the integral over T and also the whole of the integral over S, and find,
after changing the sign throughout,


. ‘ e 2 N—1 ‘ 
ou “x OU OU 
— al £3 ( a ) + 6 = j 
J 5, 5\ ant ee ox! ax! 
. v—1 ou Ou 
3.4) . a =z —-cos(mt) dS 
Jr > ax’ ax’ c 


. ) . . 
=— | ( a (*\ avs + a(gs)( 
J} ot e J 7 c 


cos (nl 


dS 


The left hand side of this equation is now positive definite in all of the
derivatives appearing, and in particular the integral over T on the left is
non-negative. Thus we can drop this term provided $\leq$ is substituted for the
equality sign, and the inequality so obtained is in the right direction for our
purposes.

Indeed, there is a positive constant c such that the left side of (3.4) exceeds

    $c \displaystyle\sum_{i=1}^{N}\int_{S_t}\left(\dfrac{\partial u}{\partial x^i}\right)^2 dS$.



Now we find an upper bound for the right hand side by conventional methods, 
and so obtain an estimate 


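(3.5)    $\displaystyle\sum_{i=1}^{N}\int_{S_t}\left(\frac{\partial u}{\partial x^i}\right)^2 dS \leq K\left[\sum_{i=1}^{N}\int_{V_t}\left(\frac{\partial u}{\partial x^i}\right)^2 dV + \int_{V_t} f^2\,dV + \int_{S}\left[\sum_{i=1}^{N}\left(\frac{\partial u}{\partial x^i}\right)^2 + u^2\right] dS + \int_{T}\left(\frac{\partial u}{\partial \xi^N}\right)^2 dS\right]$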


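for some constant K independent of u. The expression $\Phi$ may contain u itself
and so we have estimated integrals of the type

    $\displaystyle\int_{V_t} u^2\,dV$

by writing

    $u = u_0 + \displaystyle\int_0^t u_t\,dt'$,

    $\displaystyle\int_{V_t} u^2\,dV \leq 2\int_{V_t} u_0^2\,dV + 2\int_{V_t}\left(\int_0^t u_t\,dt'\right)^2 dV \leq 2t\int_{S} u_0^2\,dS + 2t^2\int_{V_t} u_t^2\,dV$.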
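The two inequalities for $\int u^2\,dV$ can be unpacked as follows. This is a sketch under an assumption made here for illustration, in line with the cylindrical, Euclidean setting of this section: $V_t$ is swept out by the slices $t' = \text{const}$, $0 \leq t' \leq t$, so that $dV = dS\,dt'$ and $u_0$ does not depend on $t'$.

```latex
\begin{align*}
u^2 &= \Bigl(u_0 + \int_0^t u_t\,dt'\Bigr)^2
      \le 2u_0^2 + 2\Bigl(\int_0^t u_t\,dt'\Bigr)^2
      && \text{since } (a+b)^2 \le 2a^2 + 2b^2,\\
\Bigl(\int_0^t u_t\,dt'\Bigr)^2 &\le t\int_0^t u_t^2\,dt'
      && \text{(Cauchy--Schwarz)},\\
\int_{V_t} u_0^2\,dV &= \int_0^t\!\!\int_S u_0^2\,dS\,dt' = t\int_S u_0^2\,dS .
\end{align*}
```

Integrating the second line over $V_t$, and bounding the inner integral at each height by its value over the full interval $[0, t]$, produces the factor $t^2$ in front of $\int_{V_t} u_t^2\,dV$.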


These terms are incorporated on the right in (3.5). Integrating from t = 0
to $t = t_0$ in (3.5), we get


| 2 3 ) “| } J. |x e = "e «| | 
3.6 » '< jo 
(3.6) -_ 2. {; ax! dV < K, Bo. f'dV + 2 ax! +u |d 


provided that $t_0$ is sufficiently small. Replacing the first term on the right in
(3.5) by means of (3.6), we finally get a similar estimate for the surface integral
on the left in (3.5).

Similar estimates for the higher derivatives of u are needed; as they can be
found by modifying the calculations of (5, §3) in the manner indicated above,
the details will be omitted. We quote the result as follows: Let $D_h u$ denote a
typical partial derivative of order h, and let p be a positive integer. Let the
coefficients $a^{ik}$, $b^i$, c, and f of (1.1) have bounded derivatives up to and including
the order N + 1; and let these derivatives of order up to N + p − 1 be
square integrable over the domain. Then there holds the estimate


> 


(3.7) ~f (D,u)°dV <KLE 


h=0 


N+oe 


a | (D, fy dS 


N+o -) 2 +2 . 
pi (p, : -\} dS + > | (Dyu vas | 
h= » eh ¢ h=t @S8 


and a similar estimate holds for

(3.8)    $\displaystyle\sum_{h=0}^{N+p}\int_{S_t} (D_h u)^2\,dS_t$.


In these formulae the summation $\sum$ over h is to be taken over all derivatives
of the order indicated. However, in the integral over T the summation $\sum'$
over h shall include only derivatives tangential to T and so only one
differentiation with respect to $\xi^N = g_2$ will appear. Again, by integrating (3.8)
over t, we find the sharp form


N+. . 
(3.9) > j (D,u)'dV < tk| 
h=md JV, 
of the estimate for the volume integrals. This is to be used in connection with
quasi-linear equations, which we shall mention in §5.


4. Extension of the domain. The following lemmas of Schauder and
Krzyzanski (5, §6) will be used here and in the next section. Let $R_k$ denote
the class of functions $v(x^i)$ having absolutely continuous derivatives of order
$\leq k - 1$ in the sense of Tonelli; and quadratically integrable derivatives of
order $\leq k$; all on a given closed domain such as the region $V_t$. Such a function
is absolutely continuous in the above sense when it is absolutely continuous
on almost every parametric line of the chosen coordinates.

With the norm

(4.1)    $\|v\|_k = \displaystyle\sum_{h \leq k}\int_{V_t} (D_h v)^2\,dV$,

the class $R_k$ becomes a linear space $R_k$. We have

LEMMA III. Polynomials $p(x^i)$ are dense in $R_k$. Provided that $v \in C^r$ on a
compact subset of V, an approximating sequence $p_n(x^i)$ can be found such
that the derivatives of order $\leq r$ of the $p_n$ converge uniformly to the derivatives of v.

LEMMA IV. Let k > N + 1, and let $v_n \in R_k$ be a sequence with uniformly
bounded norms $\|v_n\|_k \leq K$. Then there exists a subsequence $v_{n_j}$ uniformly
convergent to a limit $v \in R_k$.

The uniform convergence is established by Lemma I of (9) while the fact
that v belongs to $R_k$ follows (1) from the theory of strong convergence in $L^2$.
We now extend the domain of definition of the solution of Lemma II. Let
$0 < \delta < \delta_1$ and let us divide the larger domain V into p slices $V_j$ of width $\delta$:

    $V_j$: $(j - 1)\delta \leq t \leq j\delta$,    $j = 1, 2, \ldots, p$.

Let $S_j$ be the surface $t = j\delta$ and $C_j = S_j \cap T$. In each of these domains we
shall construct solutions which will subsequently be pieced together. The
solution in the large so found will be class $R_k$, provided that the compatibility
conditions of order $\leq k$ hold initially, and that k > N + 2.

Let $u_1$ be the local solution defined in $V_1$. We cannot apply Lemma II
to $V_2$ by taking values of $u_1$ and $\partial u_1/\partial t$ on $S_1$ as Cauchy data since these may
not be analytic in the large. However, as is remarked in (5, §7), $u_1$ and $\partial u_1/\partial t$
have derivatives of all orders on $S_1$ except on $G_0 \cap S_1$ and satisfy
the compatibility conditions of all orders on $C_1$. We therefore approximate
them by polynomial sequences $\phi_{2s}$ and $\psi_{2s}$ such that (a) the derivatives of
orders $\leq k$ and $k - 1$ respectively converge uniformly on $S_1$ while (b) on $C_1$
the derivatives of orders $\leq k + N(p - 2)$ converge. These approximations
are possible, by Lemma III.

To each pair $\phi_{2s}$, $\psi_{2s}$ there corresponds a solution $u_{2s}$ of the Cauchy problem
with data on $S_1$. We define these solutions throughout $V_2$ by selecting analytic
boundary values $\chi_{2s}$ which satisfy the compatibility conditions relative to
$u_{2s}$ on $C_1$, up to the order $k + N(p - 2)$. Thus the derivatives of $\chi_{2s}$ of order
$\leq k + N(p - 2)$ converge to the corresponding derivatives of f on $C_1$. Now
Lemma II shows that $u_{2s}$ is defined and of class $R_{k+N(p-2)}$ in $V_2$.

To extend these solutions to $V_3$ and beyond, we approximate $u_{2s}$ and
$\partial u_{2s}/\partial t$ on $S_2$ by sequences $\phi_{3sr}$ and $\psi_{3sr}$ of polynomials. By Lemma III these
approximations can be made uniform for derivatives of order $\leq k + N(p - 2)$.
Again we define solutions $u_{3sr}$ in $V_3$, with Cauchy data $\phi_{3sr}$ and $\psi_{3sr}$, and
boundary data $\chi_{3sr}$, where $\chi_{3sr}$ is a polynomial satisfying the appropriate
compatibility conditions of order $\leq k + N(p - 2)$. By Lemma II, the
solutions $u_{3sr}$ exist in $V_3$ and are of class $R_{k+N(p-2)}$ there. Also, by (3.7),

(4.2)    $\|u_{3sr}\|_{k+N(p-2)} \leq K$,

where K is independent of r and s. By Lemma IV, there exists for each s
a subsequence of the $u_{3sr}$ convergent to a limit $u_{3s}$ of class $R_{k+N(p-3)}$ in $V_3$; thus $u_{3s}$
also satisfies the differential equation and the estimate (4.2). We now
approximate to values of $u_{3s}$ and $\partial u_{3s}/\partial t$ on $S_3$ in order to define Cauchy data for a
sequence of solutions $u_{4sr}$ of class $R_{k+N(p-3)}$ in $V_4$. The approximations to the
boundary condition are again of class $C^{k+N(p-3)}$ and there exists a sequence of
solutions $u_{4s}$ of class $R_{k+N(p-4)}$ in $V_4$ which satisfy a uniform estimate of the
type (4.2).

Proceeding in this way we define a sequence of solutions $u_{js}$ of class
$R_{k+N(p-j)}$ in $V_j$, all satisfying an estimate of the type (4.2). We now piece
together the solutions $u_{js}$, for fixed s, to give solutions $u_s$ defined in
$V_2 + V_3 + \ldots + V_p$, which are of class $R_k$. By Lemma IV there exists a
subsequence uniformly convergent to a limit u of class $R_k$ in $V_2 + V_3 + \ldots + V_p$.
Assuming that k > N + 2, this function u has continuous first and second
derivatives, and satisfies the differential equation and the boundary condition.
Also, by the manner of its construction, this solution merges with the original
solution in $V_1$ to yield a solution U of class $R_k$ in $V = V_1 + V_2 + \ldots + V_p$.
This completes the proof that the solution can be extended to a domain of
arbitrary extent.


5. The non-analytic case. In the differential equation (1.1) let all
coefficients and f be of class $R_{k-1}$ (where k > N + 2) in V, and let the data
of the mixed problem be of class $C^{k+1}$. Let W (in Schauder's notation) be a
domain which contains S and is contained in the region of dependence on S
alone. We suppose that the differential equation is of class $C^{k+1}$ in W. Finally,
the conditions of compatibility up to order k inclusive shall be satisfied.
We state the result in this case as follows.

THEOREM. There exists a solution u of the given mixed problem, which is of
class $R_k$ in V, of class $R_{k+1}$ and $C^k$ in W, and which satisfies an estimate of the
form (3.7).


The proof involves approximation to the coefficients of the differential
equation by polynomials. From Lemmas III and IV we see that the approximating
coefficients $a_s^{ik}$, $b_s^{i}$, $c_s$ and $f_s$ ($s = 1, 2, \dots$), together with derivatives
up to order $k - 1$ inclusive, can be chosen to converge to their respective
limits (a) in $V$, in mean and (b) in $W$, uniformly. As in the preceding extension
of the domain of §4, the data can be approximated by polynomials which
retain the $k$ compatibility conditions. According to §4, there exists a solution
$u_s$ of the approximate problem, with

$$\|u_s\|_k < K.$$

From Lemma III we infer the existence of a uniformly convergent subsequence
tending to a limit $u \in R_k$. Since $k > N + 2$, $u$ is $C^2$ and satisfies the differential
equation and the boundary conditions. In fact $u$ is of class $R_{k+1}$ in $W$, as
follows from the theory of the Cauchy problem (9). That the solution is
unique follows from the estimate (3.7).
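
The passage to the limit in this proof can be read schematically as follows. This is only a sketch: it assumes that (1.1) has the form $L(u) = f$, and the symbol $L_s$ for the operator with the approximating coefficients, together with the subsequence index $s_j$, is notation introduced here for illustration and does not appear in the paper.

```latex
% Sketch under stated assumptions: (1.1) reads L(u) = f, and L_s is the
% operator built from the approximating coefficients (hypothetical notation).
\begin{align*}
  L_{s_j}(u_{s_j}) &= f_{s_j}, \\
  L(u) - f &= \bigl(L - L_{s_j}\bigr)(u_{s_j})
              + L\bigl(u - u_{s_j}\bigr)
              + \bigl(f_{s_j} - f\bigr).
\end{align*}
```

The second line is an identity that follows from the first by linearity. If the coefficients and $f_{s_j}$ converge, and the derivatives of $u_{s_j}$ up to the second order converge and remain bounded, then each term on the right tends to zero, which would give $L(u) = f$. The convergence of the individual terms is assumed in this sketch, not derived.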

A boundary condition of the third kind (in potential theory) can be reduced
to that treated here. If

$$\frac{\partial u}{\partial \nu} + hu = f,$$

where $h$ and $f$ are functions of position on $T$, then the reduction in (2) will
apply.

With boundary value problems for hyperbolic equations there is an evident 
analogy with potential theory, and Hadamard (4, p. 248 ff) discusses these 
problems in that light. However, the case of a plane boundary treated by him 
is essentially easier than the general case since in effect it can be solved by 
the method of images. The result found in this paper has a greater generality 
than one would expect by this analogy, since the case of the oblique derivative, 
which in potential theory requires special methods, is included.

In conclusion we note that Schauder has also treated the quasi-linear and
non-linear mixed problems with the values of $u$ assigned on $T$ (5, 10). His
methods extend without difficulty to the boundary condition studied here.
Indeed, in the quasi-linear case, the linear solution is used to define a functional
transformation, and then with the help of the sharp estimate (3.9) it is shown
that a fixed point of the transformation, and hence a solution, exists for
sufficiently small domains. The non-linear problem is reduced by differentiation
to a quasi-linear integro-differential system which can be solved under the
same conditions as the quasi-linear hyperbolic equation. Schauder's proof of
the integrability conditions for this system, which establish the existence of
the solution for the non-linear equation, requires no modification in the present
case.
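
The fixed-point idea used in the quasi-linear case can be illustrated on a toy problem. The following is a minimal sketch and is not Schauder's construction: it replaces the hyperbolic equation by the ordinary differential equation $u' = -u \cdot u$, $u(0) = 1$ (whose exact solution is $1/(1+x)$), freezes the coefficient at the previous iterate, solves the resulting linear problem, and repeats. The function name `solve_by_frozen_coefficients` and all of its parameters are hypothetical, chosen only for this illustration.

```python
import math


def solve_by_frozen_coefficients(length, n_grid=200, tol=1e-12, max_iter=100):
    """Toy fixed-point iteration for u' = -u*u, u(0) = 1 on [0, length].

    Each step freezes the coefficient at the previous iterate w and solves
    the LINEAR problem u' = -w(x) u, u(0) = 1, whose solution is
    u(x) = exp(-integral_0^x w(s) ds).  A fixed point of the map w -> u
    solves the original (discretized) quasi-linear problem.
    """
    h = length / n_grid
    w = [1.0] * (n_grid + 1)  # initial guess: the constant initial value
    for iteration in range(1, max_iter + 1):
        integral = 0.0
        u = [1.0]
        for j in range(n_grid):
            # Trapezoidal rule for the integral of the frozen coefficient.
            integral += 0.5 * h * (w[j] + w[j + 1])
            u.append(math.exp(-integral))
        change = max(abs(p - q) for p, q in zip(u, w))
        w = u
        if change < tol:
            return w, iteration
    raise RuntimeError("iteration did not converge; try a smaller domain")
```

Calling `solve_by_frozen_coefficients(0.5)` returns the grid values and the number of iterations used; the last grid value can be compared with the exact value $1/(1 + 0.5)$. On a short interval the map is a contraction in the maximum norm (for non-negative iterates the contraction constant is at most the interval length), which is the toy counterpart of the restriction to sufficiently small domains.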


6. Removal of the compatibility conditions. The preceding result has
been stated under conditions similar to those in (5) with the boundary condition
of the first kind. In both of these theorems the compatibility conditions
of order up to $k$ are somewhat inappropriate in view of the theory of discontinuities
of derivatives of solutions of hyperbolic equations. To remove this
limitation, we shall need to strengthen the differentiability conditions.

Consider, therefore, the first boundary condition when $q + 1$ ($0 \le q < k$)
compatibility conditions hold. We shall actually treat the case $q = 0$ since the
continuity across $G$ of the derivatives up to order $q$ is easily established later
in the appropriate cases. Thus, taking the Cauchy data to vanish and considering
the homogeneous differential equation, we assume only that $f$
vanishes to the first order on $C$. Let the differential equation and data be of
class $C^{2k}$, and let us reduce this problem to that treated in (5) by setting

$$u = u_1 + v.$$

We shall arrange that $L(v)$ be $C^k$ everywhere and that $u_1$ shall satisfy a boundary
condition on $T$ which is compatible of order $k$.

More precisely, we set $v = 0$ in the Cauchy domain between $S$ and $G$,
and require that $L(v)$ should vanish to the order $k$ as $G$ is approached from
above. The function $v$ itself shall be continuous, shall vanish on $G$ and shall
satisfy on $T$ the boundary condition

$$v = \sum_{n=1}^{k} f_n t^n,$$

where the $f_n$ are the coefficients of $f$ in a Taylor series expansion in powers of $t$.


For $u_1$ we now have

$$u_1 = f - \sum_{n=1}^{k} f_n t^n \quad \text{on } T.$$


Since we have assumed $L(u) = 0$ in the reduced form of the boundary value
problem, the first compatibility conditions for $u_1$ will be the vanishing of the
appropriate derivatives of $f$. Thus the problem for $u_1$ is of the above type,
since we have in effect taken $u = 0$ in the region $W$ of the theorem. This
shows that the problem is in this case reduced to finding $v$.

For this purpose we note that all functions and coefficients can be expanded
in a Taylor series of powers of $t$ (where $t = 0$ is the equation of $G$) up to terms
of order $t^k$ and with a remainder of this order in $t$. The coefficients of terms
containing $t^r$ are derivatives of order $r$, and so are $C^{2k-r}$. Hence all such
coefficients are $C^k$. Following Hadamard (4, pp. 78-79), we construct the first $k$
terms of the series in powers of $t$, of the problem

$$L(w) = 0,$$

with $w = 0$ on $G$ and $w = f$ on $T$.
By the manner of its construction, the function

$$v = \sum_{n=1}^{k} w_n t^n$$

satisfies

$$v = \sum_{n=1}^{k} f_n t^n \quad \text{on } T,$$

and

$$L(v) = t^{k+1} r,$$

where $r$ is a $C^k$ remainder term, in the region between $G$ and $T$. Thus $L(v)$
has continuous derivatives up to order $k$ across $G$ and the reduction is established.
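
The reduction can be collected in one place as follows. This is a sketch assembled from the statements of this section, under the assumption that the decomposition used is $u = u_1 + v$ and that the remainder has the form $L(v) = t^{k+1} r$; it is not a quotation of the paper's own display.

```latex
% Sketch of the reduction, assuming u = u_1 + v and L(v) = t^{k+1} r.
\begin{align*}
  u &= u_1 + v, \\
  L(u) = 0 \;&\Longrightarrow\; L(u_1) = -L(v) = -t^{k+1} r
      \quad \text{between $G$ and $T$}, \\
  u = f \ \text{and}\ v = \sum_{n=1}^{k} f_n t^n \;&\Longrightarrow\;
      u_1 = f - \sum_{n=1}^{k} f_n t^n \quad \text{on $T$}.
\end{align*}
```

Read this way, the boundary datum for $u_1$ has a Taylor expansion in $t$ that begins with the term of order $k + 1$, and the right-hand side of its equation vanishes to order $k$ on $G$, which is consistent with a problem that is compatible of order $k$.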

If the first $q + 1$ compatibility conditions hold, then it is easy to show
that $u$ is $C^q$ across $G$, $q < k$, by considering the discontinuities of successive
derivatives across $G$ and noting that since they vanish on $C$ they must vanish
along all bicharacteristics issuing from $C$. We may now state the existence
theorem of (5) with this modification.


Let the differential equation and boundary datum function $f$ be $C^{2k}$, $k > N + 2$,
and let the first $q + 1$ compatibility conditions hold. Then there exists a unique
solution of $L(u) = 0$ in $V$ with given Cauchy data and with $u = f$ on $T$. The
solution is of class $R_k$ in $V$ except that if $q < k - N$ it is of class $C^q$ across $G$.


The domain of this solution is, however, restricted to the domain wherein $v$
has been defined and so does not include any multiple points of the characteristic
surface $G$.

A similar reduction for the boundary condition of the second kind, considered
in this paper, is possible. Here, however, it is not necessary that any
compatibility condition should hold. We calculate $v$ as the first $k$ terms of the
analytic series expansion in §2, and proceed as above. The result may be
stated as follows when $q$ compatibility conditions hold.


Let the differential equation be C* and the boundary datum function be C” 
k > N + 2. Let the first q compatibility conditions hold on C. Then there exists 


a unique solution of $L(u) = 0$ in $V$ with given Cauchy data and with

$$\frac{\partial u}{\partial \nu} = f$$

on $T$. The solution is of class $R_k$ in $V$, except that if $q < k - N$ it is of class $C^q$
across $G$.


Again the domain is limited by multiple points or self-intersections of the 
characteristic surface G. 







REFERENCES 


1. S. Banach and S. Saks, Sur la convergence forte dans les champs $L^p$, Studia Math., 2 (1930),
51-57.
2. G. F. D. Duff, Uniqueness in boundary value problems for the second order hyperbolic equation,
Can. J. Math., 8 (1956), 86-96.
3. K. Friedrichs and H. Lewy, Ueber die Eindeutigkeit und das Abhängigkeitsgebiet der Lösungen
beim Anfangswertproblem linearer hyperbolischer Differentialgleichungen, Math. Ann.,
98 (1928), 192-204.
4. J. Hadamard, Lectures on Cauchy's problem in linear partial differential equations (New
York, 1952).
5. M. Krzyzanski and J. Schauder, Quasilineare Differentialgleichungen zweiter Ordnung vom
hyperbolischen Typus: Gemischte Randwertaufgaben, Studia Math., 6 (1936), 162-189.
6. O. A. Ladyshenskaya, Smesannaya cadaca dlya gyperboliceskova uravneniya (Moscow, 1953).
7. J. Leray, Hyperbolic Differential Equations (Princeton, 1953).
8. A. Robinson and L. Campbell, Mixed problems for hyperbolic partial differential equations,
Proc. London Math. Soc. (3), 5 (1955), 129-147.
9. J. Schauder, Das Anfangswertproblem einer quasilinearen hyperbolischen Differentialgleichung,
Fund. Math., 24 (1935), 213-246.
10. J. Schauder, Gemischte Randwertaufgaben bei partiellen Differentialgleichungen vom
hyperbolischen Typus, Studia Math., 6 (1936), 190-198.


University of Toronto 




